I'm currently pursuing a PhD at Dartmouth in the CS Department. My research focuses on computer vision, artificial intelligence, and robotics. I'm particularly interested in exploring how computers, robots, and artificial agents can perceive and reason about the world in a way similar to humans.
Some things I'm interested in:
- Computer Vision
- Video Understanding
- Deep Learning/Machine Learning
- High-performance computing
- Mathematics
Some personal facts:
- Dog lover: Tala, Luna, Raissa & Naia
- I like to travel around the world
- Audiophile & amateur musician
- Water sports: It's always better to be near the sea
- Sometimes I'm a geek and like to do reverse engineering
- I founded Wiqonn, a tech startup based in Latin America that specializes in artificial intelligence, data analysis, and software development
Contact:
- Academic Email: [email protected]
- Personal Email: [email protected]
- Business Email: [email protected]
- Github: https://github.com/waybarrios
- Google Scholar: wayner-gscholar
Projects:
- ActivityNet: ActivityNet can be used to compare algorithms for human activity understanding: untrimmed activity classification, activity detection, and temporal localization. I developed algorithms and interfaces for collecting candidate videos, filtering and temporal annotation tools, the QA process, dataset building, and the evaluation server.
- Worldmap: Harvard Worldmap is an online, open-source mapping platform developed to lower barriers for scholars who wish to explore, visualize, edit, and publish geospatial information. I worked on large-scale workload optimization for production, code development for visualizing and storing large geospatial datasets, and migration processes.
- GeoNode: GeoNode is an open-source platform that facilitates the creation, sharing, and collaborative use of geospatial data. I worked on service virtualization and inter-service network communication.
- CAN Bus: An exploration of reverse engineering the CAN bus of a Ford Fusion. CAN is a communication protocol designed to facilitate reliable, prioritized exchange of messages among the Electronic Control Units (ECUs) in modern vehicles.
- AR Drone C++: A C++ library for controlling the AR-DRONE. It does not depend on the Parrot SDK. The classes are based on earlier work done in Java.
- DQN examples: I developed some DQN examples from a Python Barranquilla meetup.
- VivaMed - HUMATH: Telemedicine pneumonia detection: VivaMed/HUMATH addresses problems arising from COVID-19 and helps physicians detect COVID-19 using medical imaging in high-level Colombian hospitals. Sponsored by the Colombian Ministry of Science, Technology and Innovation. My work covered serving machine learning (ML) models in production, computer vision advising, data privacy, and system architecture.
- K3D-nginx: An exploration of k3d Kubernetes clusters using the NGINX ingress controller.
- Deep Intermodal Video Analytics: The objective of the DIVA program was to create reliable automated detection of activities in a streaming video setup with multiple cameras. It was fully funded by IARPA and NIST. Within this project, I was responsible for designing computer vision algorithms that detect and recognize intricate activities and events in surveillance videos, accounting for the diverse viewpoints of both overlapping and non-overlapping cameras. I participated in the ActEv Challenge held at the CVPR'19 ActivityNet workshop (CMU team). The code is located in this repository.
- Guidance Based Video Grounding: The official implementation of the paper "Localizing Moments in Long Video Via Multimodal Guidance". In this repository, we provide the scores predicted by the Guidance Model on the MAD dataset. The paper was accepted at ICCV 2023.
- Multi-layer Learnable Attention Mask for Multimodal Tasks: The self-attention mechanism in Transformer models faces challenges in diverse settings due to varying token granularity and high computational demands. The Learnable Attention Mask (LAM) addresses these challenges by optimizing attention maps and focusing on critical tokens, enhancing performance across multiple layers of BERT-like networks. Experiments on datasets such as MADv2 and ImageNet 1K show LAM’s effectiveness in improving performance and reducing redundancy. This work advances understanding in complex tasks, including movie analysis.
Publications:
Check my Google Scholar profile for more details.