Machine Learning Engineer and Incoming HCI PhD Student
Portfolio
Collins Aerospace
Machine Learning Meets Aerospace
Rekku
Anime Recommendation System
Ciku
Seasonal and Local Recipe Finder
clAIRity
Airport Navigation System for the Visually Impaired
Bot Classification
Runescape Bot Detector
GE Prediction
Runescape Grand Exchange Price Prediction
What I hope shines through all these projects is a love for applying machine learning to anything I can get my hands on, whether that means leading teams or building things myself. I thrive working on novel use cases and make sure that what I create reaches the hands of real users.
Hi, I’m Chris! I'm currently a Machine Learning Engineer at Collins Aerospace (a Raytheon subsidiary). I'm also an incoming PhD student in Human-Computer Interaction with a focus on applying natural language processing and symbolic AI to educational technologies.
Outside of work, I love eating food from a variety of cultures, reading tons of non-fiction, and getting to know new people. I enjoy talking about crime/sports documentaries, personal finance, and video/board games. I also grew up in Malaysia where my grandparents own a small fruit orchard (they grow durians!).
Propel Projects
Chronic Coder is my YouTube channel. I worked on several projects where I applied machine learning to online games (like Runescape and Neopets). People started seeing how much you could learn from applying AI/ML to these unique and challenging use cases. Hence, we started the Chronic Coder Academy to help people learn by doing similar projects!
This soon evolved into a much larger program called Propel Projects. Now, we put together multi-disciplinary teams of tech beginners to create a working prototype in 2 months. The goal is for them to learn through the application of their skills. It's always fulfilling watching students grow and learn in the process.
Collins Aerospace is one of the world's largest suppliers of aerospace and defense products. I’m an ML Engineer on the Foundational AI team. We’re focused on supporting AI/ML contracts and projects related to emerging technologies in aerospace.
I can’t share too much about the exact products we’re creating (patent pending and all that), but I’d love to talk about the problems we’re tackling and the technology behind their solutions. I’m going over quite a few here, but the other tiles in my portfolio are more focused on individual projects - promise! Some projects I'm working on:
Crowdsourced Weather
Leading a million-dollar contract to develop an event-driven, cloud-based data streaming platform for crowdsourcing weather information. The platform processes around 10K weather camera images each time the FAA updates them (typically every few minutes). The latest containerized application is built with Spark, Docker, Cassandra, S3, EC2, and Flask.
Global Weather Forecasting
Integrating U-Net convolutional neural networks with geospatial Python libraries to forecast global precipitation within minutes - a process that currently takes the physical models running on supercomputers hours.
Understanding Human Behavior
Combining few-shot meta-learning, knowledge graphs, and decision making techniques to create a holistic web-based system for understanding human actions to aid the optimization and support of airplane cabins and cockpits.
Community of Practice (CoP)
Leading the ML/AI CoP - the largest active CoP with 1000+ members and 300+ monthly attendees. Establishing connections and aligning goals of ML/AI leaders across Collins Aerospace, Raytheon, and UTC.
Rekku
Using natural language processing and knowledge graphs to efficiently personalize anime recommendations. The Rekku API is currently deployed on a large anime recommendation website with around 80,000 unique monthly users. See it live on RandomAnime
The Problem
My friends and I were searching for some new anime to watch and realized that all of the trusted online sources were either word-of-mouth reviews or basic filtering techniques - aside from Crunchyroll and Netflix, which only recommend their own anime. Thus, we set out on a journey to create our own system to provide personalized anime recommendations regardless of the hosting platform.
Data Collection
We started by looking for available data. There was a pretty popular anime API from MyAnimeList that we utilized at first. It provided some basic information about each anime, like the genre, release date, and ratings. Through user testing, we found the need for deeper sources of data, like specifics about anime characters and their tags, which we later web scraped from Anime Planet.
Data Preprocessing
We created a pipeline to collect all the new data we needed each season. The preprocessing mainly came down to making sure the anime from MAL matched up with the characters from AP. There was also a lot of missing data to clean up and many outliers, usually from different forms of anime, like Chinese webtoons or anime movies.
Data Storage
Working with a micro EC2 instance (1 GB RAM, 8 GB storage), we really had to scale down our processing and storage to keep things running quickly. We stored everything in tables initially and later realized that the data we had made much more sense as connected graphs; between that and some data structures & algorithms magic, the API requests fell from 4 seconds to 0.1 seconds!
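To give a flavor of what that looked like, here's a minimal sketch of the table-to-graph idea (the field names are hypothetical, not our actual schema): keep an in-memory adjacency map from each anime to its tags and characters so a request becomes a couple of dictionary hops instead of repeated table scans.

```python
# Hypothetical sketch: replace per-request table joins with an in-memory
# adjacency map so related tags/characters are O(1) dictionary lookups.
from collections import defaultdict

def build_graph(anime_rows, character_rows):
    """anime_rows: [(anime_id, tag), ...]; character_rows: [(anime_id, character_id), ...]"""
    graph = defaultdict(lambda: {"tags": set(), "characters": set()})
    for anime_id, tag in anime_rows:
        graph[anime_id]["tags"].add(tag)
    for anime_id, character_id in character_rows:
        graph[anime_id]["characters"].add(character_id)
    return graph

def shared_tags(graph, anime_a, anime_b):
    # Constant-time neighborhood lookups instead of scanning joined tables.
    return graph[anime_a]["tags"] & graph[anime_b]["tags"]
```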
Machine Learning API
We tried many different algorithms and landed on a few that made the most sense to us and our user-tested audience. Some noteworthy ones include word2vec recommendations based on user reviews, regression on key factors (found through user testing and validation), and soft clustering on pairs of variables.
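As a rough illustration of the review-based approach (not our exact pipeline), you could train word2vec on tokenized review text, average the word vectors per show, and recommend the nearest shows by cosine similarity:

```python
# Illustrative sketch only: assumes gensim 4.x and pre-tokenized reviews.
import numpy as np
from gensim.models import Word2Vec

def embed_reviews(reviews_by_anime):
    """reviews_by_anime: {anime_id: [list of tokenized review sentences]}"""
    sentences = [tokens for reviews in reviews_by_anime.values() for tokens in reviews]
    model = Word2Vec(sentences, vector_size=100, window=5, min_count=2, workers=4)
    anime_vecs = {}
    for anime_id, reviews in reviews_by_anime.items():
        words = [w for tokens in reviews for w in tokens if w in model.wv]
        if words:
            # One vector per show: the mean of its review word vectors.
            anime_vecs[anime_id] = np.mean(model.wv[words], axis=0)
    return anime_vecs

def recommend(anime_vecs, query_id, top_k=5):
    query = anime_vecs[query_id]
    scores = {
        other: float(np.dot(query, vec) / (np.linalg.norm(query) * np.linalg.norm(vec)))
        for other, vec in anime_vecs.items() if other != query_id
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```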
Validation
We also created a validation pipeline to make sure that the algorithms we produced were worthwhile. To judge the accuracy of the models, we used the manual user recommendations on MAL (which we had to scrape since they weren’t provided through the API).
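A validation check along those lines might look something like this (function and variable names are illustrative): score each model's top-k list with precision@k against the scraped MAL recommendations.

```python
# Illustrative sketch: treat MAL's user-submitted recommendations as ground
# truth and score our top-k lists with precision@k.
def precision_at_k(predicted, ground_truth, k=10):
    top_k = predicted[:k]
    hits = sum(1 for anime_id in top_k if anime_id in ground_truth)
    return hits / k

def evaluate(model_recs, mal_recs, k=10):
    """model_recs / mal_recs: {anime_id: [recommended anime ids]}"""
    scores = [
        precision_at_k(model_recs[a], set(mal_recs[a]), k)
        for a in model_recs if a in mal_recs
    ]
    return sum(scores) / len(scores) if scores else 0.0
```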
User Testing
It has to be said that, before our integration with RandomAnime, we worked with an amazing frontend developer (Joel Wong) and an awesome UX designer (Joey Wong) to get an MVP running and demonstrable for usability testing. Getting the product into the hands of users proved to be one of the most important things for our project. They gave us crucial feedback that we were able to keep iterating on.
Deployment
As mentioned earlier, we’re using a micro AWS EC2 instance to deploy the API. We specifically set out to utilize Docker for containerization and, thus, it was pretty effortless to get up and running from there! Check out the deployed API's documentation.
Integration
After looking at the competitors, we chose RandomAnime to integrate with. The person running it seemed incredibly genuine in his love for anime (through his Instagram, Facebook, and MyAnimeList posts). Furthermore, he remained very agnostic to the hosting platforms (even going so far as to include the not-so-legal viewing options). Those viewing options were crucial for us because they were one of the most requested features during user testing, and he had collected that unique dataset personally over the years.
All that said, he also ran one of the largest anime recommendation sites on the web, so it couldn’t have hurt! We had some initial talks with him while we were finalizing our API, and within about a month our system was being used by tens of thousands of users on his recommendation pages!
The Future
Now that we have live users, we’re collecting data on their preferences and what they look for in recommendations. We intend to use this data to continue improving our algorithms moving forward.
Ciku
We created a platform for finding sustainable recipes that use seasonal and local produce based on your location and the time of year. Check out the Live Webapp!
The Problem
People care about seasonal and local food and know it’s important for the environment, but they have no easy way to find out what’s in season or grown locally. This matters most when deciding which recipes to follow and which groceries to buy.
Data Collection
We knew off the bat that we’d have to combine several sources of data. The first was the USDA, which publishes various kinds of publicly available data on crops, livestock, and other produce each year. Then we chose the Kaggle Food.com recipe dataset, which was scraped a couple of years ago, just to get a quick set of around 200K recipes to work with. We later had to rescrape all of these (it took 6 days…) because the dataset didn’t come with images or ingredient quantities.
Data Preprocessing
The Kaggle dataset was mostly good to work with off the shelf. The only messy part was the user-inputted ingredients and recipe titles, where we had to apply some classic NLP autocorrection, stemming, and lemmatization.
The USDA dataset, however, was a NIGHTMARE! It gets really messy because, depending on whether you grab the data by a specific time period or by location, you get completely different results and ranges for the output. Sure, there was missing produce, missing days or months, and missing states to deal with, but the variables, descriptions, and units of measurement also change based on the selection criteria. There isn’t one large, whole dataset we could pull and make sense of either; it all came in weird batches because of limits on how much data you can grab at once.
Furthermore, these production values over the years were what really mattered to us: to measure seasonality, we derived a ratio of each month's production relative to all other months, and similarly compared production across states for locality. Suffice it to say, wrangling the data into a simple ‘what’s in season this month at my location?’ answer was one of the hardest data preprocessing challenges of my life.
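Here's a simplified sketch of that relative-ratio idea, assuming a tidy table of production by crop and month (the real USDA data needed far more wrangling than this, and the column names are hypothetical):

```python
# Simplified sketch: a crop's seasonality score for a month is its production
# that month divided by its total production across all months. The same
# pattern, computed by state, would give a locality score.
import pandas as pd

def seasonality_scores(df):
    """df columns: crop, month (1-12), production (same units per crop)."""
    monthly = df.groupby(["crop", "month"])["production"].sum()
    totals = df.groupby("crop")["production"].sum()
    return monthly.div(totals, level="crop").rename("seasonality_ratio").reset_index()

def in_season(df, crop, month, threshold=1 / 12):
    # "In season" here means this month's share beats a uniform-month baseline.
    scores = seasonality_scores(df)
    row = scores[(scores["crop"] == crop) & (scores["month"] == month)]
    return bool(len(row)) and row["seasonality_ratio"].iloc[0] > threshold
```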
The API
Working closely with our awesome frontend developer, Sandy Kuo, I came up with the endpoints and parameters for the necessary app features. A lot of minor improvements showed up along the way to help efficiency, usability, and functionality, like error handling, calculating sustainability scores, and filtering. One particular challenge was matching recipe titles to the user's query. Here, I tried a few different NLP preprocessing techniques (like stemming and lemmatization) but ended up going with partial-ratio fuzzy matching, which gave the quickest and most agreeable results.
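A small example of what partial-ratio matching can look like, here using the rapidfuzz library (the exact library and thresholds in Ciku may differ):

```python
# Illustrative only: rank recipe titles by partial-ratio similarity to a query.
from rapidfuzz import fuzz, process

def match_recipes(query, titles, limit=10, min_score=70):
    """Return recipe titles ranked by partial-ratio similarity to the query."""
    results = process.extract(
        query, titles, scorer=fuzz.partial_ratio, processor=str.lower, limit=limit
    )
    # Each result is (title, score, index); keep only reasonably close matches.
    return [title for title, score, _ in results if score >= min_score]

print(match_recipes("pumpkin soup", ["Creamy Pumpkin Soup", "Pumpkin Pie", "Tomato Soup"]))
```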
Deployment
Here again, we’re using a micro AWS EC2 instance to deploy the API and Docker for containerization. Check out the Ciku API documentation!
Web Application
Again, there’s SO MUCH more happening in the background (or foreground, I guess) that went into the frontend and user experience surrounding Ciku. We were actually just aiming for sustainability in general when we started; we slowly narrowed down to this niche based on some amazing interviews conducted by Antonette Adovia. She and Sandy worked out and executed the design that’s currently on the web app. A recent usability test showed that people love many of the app’s unique features and already want to start using it!
The Future
After usability testing, it was also clear to us what to work on next to keep upgrading our product. On my end, we’re aiming to involve more machine learning in recommending recipes to users based on the data we collect. We’re also hoping to further define and clarify the process of scoring the recipes in terms of sustainability. Other than that, we’re going to start taking in user submitted recipes and reviews which will lead to a lot of opportunity for heavier NLP usage.
clAIRity
clAIRity is an app for the visually impaired (VI) that simplifies airport navigation. The app allows the VI to get clear directions while navigating the airport using their preferred input method (text or voice commands).
This is one of the four Chronic Coder Academy (CCA) projects in our 3rd cohort that I personally led. I served as the project manager rather than in an engineering role, leading a wonderful team of three (Elaine Liu - Backend, Faith Chua - Frontend, Kenny Tran - UX).
The Problem
The visually impaired have trouble navigating large open spaces, especially ones that are new to them. Airports are one of the biggest challenges they face, and the VI usually have to ask for assistance upon arrival.
The Methodology
For most CCA projects, I follow Basecamp’s Shape Up methodology, which is built around 6-week development cycles in teams of three. It helps our teams clearly write out the pitch, place our bets on the right approach and ideas, and finally execute by breaking things down into chunks. I’ll elaborate on how these went in the next couple of sections.
The Pitch
The idea behind clAIRity first came from a want to do social good. We brainstormed a few ideas and landed upon something related to computer vision for assisting the VI (taking slight inspiration from the k-drama “Startup”). After some great initial interviews with visual impairment experts, Kenny found out that one of the biggest pain points plaguing the VI was airports (large, open, unfamiliar spaces that they’d have to use to travel).
Using this knowledge, Kenny and I drafted the pitch: the problem, the solution, the rabbit holes that might slow us down, and the no-gos to avoid during this cycle. Special emphasis goes to the solution; Shape Up stresses keeping it high level to allow team members autonomy in their individual roles. Hence, the solution consisted of a summary along with some high-level technical requirements and visual aids.
The Process
Once the pitch was done, it was time to start work on the project. We held team meetings at the end of each week to discuss progress and next steps. I’d write down the action items for the team's reference and check in on their progress over text midweek. Lastly, I held one-on-ones with each team member every 3 weeks or so to gauge their development, interests, and future goals.
The Future
We really made strides on the technology in this first cycle (converting maps into graphs, pathfinding, the initial UI, and a fleshed-out high-fidelity UX prototype). What we really need to do from here is get the application to follow accessibility guidelines for VI apps (including text-to-speech audio descriptions for all the different features/buttons, larger text overall, etc.). Other than that, we still have a lot of work for the tech to catch up to the design, both on the frontend and backend, to make it more robust and usable - for example, implementing all of the missing pages from the design and being able to take in more airports automatically.
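As a toy illustration of the map-to-graph piece, here's what pathfinding over an airport graph can look like; the nodes, edges, and weights below are made up, not from our actual maps:

```python
# Toy sketch: airport landmarks become nodes, walkable connections become
# weighted edges, and directions come from shortest-path search (Dijkstra).
import heapq

airport = {
    "Entrance": {"Check-in": 40, "Restroom": 25},
    "Check-in": {"Entrance": 40, "Security": 60},
    "Restroom": {"Entrance": 25},
    "Security": {"Check-in": 60, "Gate A1": 120},
    "Gate A1": {"Security": 120},
}

def shortest_route(graph, start, goal):
    queue = [(0, start, [start])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for neighbor, weight in graph[node].items():
            if neighbor not in seen:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None

print(shortest_route(airport, "Entrance", "Gate A1"))
```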
Bot Classification
I went through quite the roller coaster trying to classify Old School Runescape (OSRS) players as bots or not bots. All the way from trying to collect data from a historically old MMORPG to handling over 2 million samples for some unsupervised clustering, it was a wild ride. Check out my full thought process and timeline as I progressed through this project in my YouTube series.
The Problem
Bots are a huge problem in OSRS. It’s an old game with a strong player base but, due to lack of care by the developers, most of the players you see online aren’t human. These bots automatically take up resources, kill players, and steal loot - they do this so they can sell their in-game currency for real money. It’s called RWT (real world trading) and people genuinely make tons of money from it. This cycle leads to more and more bots being created.
Data Collection
This was one of my most interesting exercises in data collection. I had experience previously trying to visually gather information from OSRS but it was incredibly difficult because of shifts in game-view perspective and the lower resolution nature of the game itself. I spent a long time just trying to classify cows with YOLO and it eventually worked but I couldn’t imagine having to do that AND generalize it to players with their different outfits and body shapes.
However, by the time I wanted to start this project, I had built a bit of a community around my content. When I announced that I’d be taking on this project, a few of them suggested I look into specific botting software. I had never tried it before because I could never find it through a Google search (plus, yeah, it's a little unethical - though I won't pretend I hadn't tried to find some). They gave me specific search terms, and I landed on some really interesting botting software and APIs.
I ended up going with Dreambot, one of the more common pieces of botting software, which acts as a mirror bot. It essentially mirrors your game client and uses all that information to, normally, move your character around the world for you and do things like mining, attacking others, etc. We saw an opportunity to use the evil software for our data collection instead!
The Dreambot API was entirely in Java so (after some brushing up) I started programming with that to see if I could get certain information about the players around my character. It worked like a charm! I could get data about their in-game coordinates, their movements, their skill levels, their clothes, even their animation actions.
Then I created a server in Python that communicated with the live Dreambot Java client through sockets (I swear I hadn't used those since my first year of college - who knew they'd be useful). I wanted to pipe data into Python live so that we could classify these bots on the spot when the time came, since the only way to report them was typically while they were in front of you. From there, I'd let my character sit and observe everyone around it for some time before moving to another spot. Once I had a good set of data, I moved on to processing it!
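Here's a stripped-down sketch of what the Python side of that socket link can look like; the port and the newline-delimited JSON message format are assumptions for illustration, not the exact protocol I used:

```python
# Sketch only: listen on a local port, read newline-delimited JSON player
# observations sent by the Java client, and hand each one to a classifier.
import json
import socket

def serve(host="127.0.0.1", port=5000, handle=print):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((host, port))
        server.listen(1)
        conn, _ = server.accept()
        with conn, conn.makefile("r") as stream:
            for line in stream:          # one JSON-encoded observation per line (assumed format)
                player = json.loads(line)
                handle(player)           # e.g. run the bot classifier on the spot

if __name__ == "__main__":
    serve()
```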
Data Preprocessing
This took two or more videos to cover because we found there were so many ways we could try to tackle the problem. There are many different types of bots out there, each with different ways of identifying them. Hence, I settled on a specific kind of bot: suicide bots. They usually get banned within a few days to a week because they don't try to hide what they do. All they aim for is getting as much money as they can from one skill (e.g. Mining).
Without going too much into the gory details, here are some engineered features that worked well (there's a quick code sketch after the list):
amount of skills above level 10 (to see if they were hyper focused on one skill)
mean skill level (to determine if they were dedicated to skilling to begin with)
equipment count (typically suicide bots have very minimal equipment on)
skill-of-interest minus mean skill level (again to see if Mining, Woodcutting, etc. was what they’re primarily focused on)
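Here's that quick sketch of the features, computed from a single player's skills and equipment (the field names are hypothetical, not the exact Dreambot output):

```python
# Illustrative feature engineering for one observed player.
import numpy as np

def engineer_features(player, skill_of_interest="Mining"):
    levels = np.array(list(player["skills"].values()))
    mean_level = float(levels.mean())
    return {
        "skills_above_10": int((levels > 10).sum()),   # hyper-focused on one skill?
        "mean_skill_level": mean_level,                # dedicated to skilling at all?
        "equipment_count": len(player["equipment"]),   # suicide bots carry very little
        "focus_gap": player["skills"][skill_of_interest] - mean_level,
    }

example = {
    "skills": {"Mining": 72, "Attack": 1, "Defence": 1, "Woodcutting": 3},
    "equipment": ["Rune pickaxe"],
}
print(engineer_features(example))
```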
For the actual gory details of my exploratory data analysis, check out the Python Notebook.
Data Storage
A little later on in the project, I was contacted by an amazing engineer, Dr Clint, who had also been collecting this kind of data. He's a software engineer at Amazon and was doing what I did at a MUCH HIGHER scale. He had collected about 2 million samples at that point, and we consolidated the data into his DynamoDB for storage. That definitely helped the accuracy and efficiency of the model overall.
Deployment and Validation
For my final video in the series, I let my bot-detection character run for about a week. During that time, I collected data and classified them on-site. The usernames of those who were deemed to be bots were saved to a file for confirmation later on.
Validation was a little tricky here too because there wasn't a clear-cut way of determining whether someone was banned other than checking their name on the OSRS skill rankings (the hiscores). If their name was there previously but later taken down, we could assume they were banned.
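A hedged sketch of that check: if a username used to appear on the hiscores but now comes back "not found", assume a ban. The endpoint below is the public hiscores "lite" URL as I understand it; treat it as an assumption rather than the exact check I ran.

```python
# Sketch only: the URL and the 200-vs-404 behavior are assumptions about the
# public OSRS hiscores "lite" endpoint, not a documented guarantee.
import requests

HISCORES_URL = "https://secure.runescape.com/m=hiscore_oldschool/index_lite.ws"

def still_on_hiscores(username):
    response = requests.get(HISCORES_URL, params={"player": username}, timeout=10)
    return response.status_code == 200  # a 404 typically means no hiscore entry

def likely_banned(previously_listed_names):
    return [name for name in previously_listed_names if not still_on_hiscores(name)]
```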
After the week of testing, we confirmed that we had over 95% accuracy with detecting suicide bots!
The Future
These passionate OSRS folks reached out on Discord to see how they could continue development on the project. It feels amazing to have inspired such a drive for change - the future is in their hands! You can check out some of their great work and coverage here:
GE Prediction
Everyone wishes they could automate money making. It's no different in an online game, where the power of in-game money gets you anything (even access to members-only game features). This is currently my highest-starred GitHub repo, and it's been so exciting to inspire many discussions and spin-off repositories along the way. Check out the original video series for my thoughts and progress throughout this project!
The Problem
In Old School Runescape (OSRS), the Grand Exchange (GE) is a trading system where players buy and sell almost all tradeable items. Item prices are set entirely by the game's players and fluctuate from day to day based on supply and demand, just like the real-life stock market.
This creates an amazing opportunity for money making, especially for players with less time to dedicate to actually gathering resources to sell. However, just like in the real stock market, to make a profit you have to take some speculative bets on what will rise in price. Unlike the stock market, the prices of most items don't constantly climb due to inflation; the range of prices is fairly capped based on past trades, and none of these items will ever go away (unlike stocks, which can drop out entirely).
Hence, the challenge was to use machine learning to predict item prices and come up with a strategy for optimizing our trades.
Data Collection
Thankfully, OSRS provides an API for the latest item prices. It's not incredibly accurate and only updates every 2.5 minutes or so, though there have since been community efforts to provide more accurate and stable data. When I ran this project, there wasn't a good source of historical data either, so I had to run a cron job and save the data retrieved every few minutes. At the time, I didn't have enough experience with EC2 or other cloud compute options, so I literally left my laptop on for a week to gather the data I needed...
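For illustration, a minimal version of that polling job could look like the following; the endpoint shown is the community real-time prices API that exists today, not necessarily the one I hit back then:

```python
# Sketch only: the URL, headers, and response shape are assumptions about the
# community-run real-time prices API, not a record of my original setup.
import json
import time

import requests

PRICES_URL = "https://prices.runescape.com/api/v1/osrs/latest"
HEADERS = {"User-Agent": "ge-price-collector sketch"}

def collect(outfile="ge_prices.jsonl", interval_seconds=300):
    """Append a timestamped snapshot of the latest prices every few minutes."""
    while True:
        snapshot = requests.get(PRICES_URL, headers=HEADERS, timeout=10).json()
        snapshot["collected_at"] = time.time()
        with open(outfile, "a") as f:
            f.write(json.dumps(snapshot) + "\n")
        time.sleep(interval_seconds)
```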
Data Preprocessing
This project led to some of my first standard practices in time series preprocessing. I started by dropping duplicates, filling missing values, and converting datetime objects, then moved on to engineered features typically used in time series prediction models, like a moving average, RSI, and a differenced signal. From there, I went through some standard feature selection procedures using recursive feature elimination, f_regression, etc. For a deeper dive, you can find the preprocessing code on my GitHub.
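Here's a condensed sketch of that flow on a single item's price series (the column names are illustrative, not the repo's exact ones): rolling mean, a simple RSI, a differenced signal, then f_regression scores for feature ranking.

```python
# Illustrative preprocessing for one item's prices (DataFrame indexed by timestamp).
import pandas as pd
from sklearn.feature_selection import f_regression

def build_features(df, window=14):
    """df: DataFrame with a 'price' column indexed by timestamp."""
    df = df.sort_index().drop_duplicates().ffill()
    delta = df["price"].diff()
    gain = delta.clip(lower=0).rolling(window).mean()
    loss = (-delta.clip(upper=0)).rolling(window).mean()
    df["moving_avg"] = df["price"].rolling(window).mean()
    df["rsi"] = 100 - 100 / (1 + gain / loss)
    df["diff"] = delta
    df["target"] = df["price"].shift(-1)          # next-step price to predict
    return df.dropna()

def rank_features(df, feature_cols=("moving_avg", "rsi", "diff")):
    scores, _ = f_regression(df[list(feature_cols)], df["target"])
    return dict(zip(feature_cols, scores))
```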
Machine Learning API
In short, I used a slew of recurrent neural networks with different settings for univariate, multivariate, and multivariate multi-step approaches. I also leaned heavily on hyperparameter tuning for each individual item's model. Paired with the feature selection functions from the preprocessing step, we were able to achieve around 80% accuracy on average when predicting future item prices. For more info, check out the code.
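As a bare-bones illustration, a univariate LSTM in Keras looks roughly like this; the tuned per-item architectures in the repo are more involved:

```python
# Minimal univariate LSTM sketch (windowing + a tiny Keras model), not the
# tuned per-item models from the project.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def make_windows(series, lookback=24):
    X, y = [], []
    for i in range(len(series) - lookback):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback])
    return np.array(X)[..., np.newaxis], np.array(y)

def build_model(lookback=24):
    model = Sequential([
        LSTM(32, input_shape=(lookback, 1)),
        Dense(1),                      # next price step
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Usage sketch: X, y = make_windows(normalized_prices); build_model().fit(X, y, epochs=20)
```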
Web Application
Our task wasn’t over yet! We needed to display the results and also use them to inform future decisions for maximum profitability. I spun up a Flask API to grab the predicted prices for each item and displayed them on a simple Bootstrap frontend with D3.js. Users could also input how much money they currently had, and the app would calculate the best set of items to buy right now to make the most profit if they sold about an hour after purchase.
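A simplified take on that "what should I buy right now?" calculation: rank items by predicted return and greedily fill the budget. The item fields and the one-hour horizon are illustrative, and a fuller version would also respect GE buy limits.

```python
# Illustrative greedy budget allocation over predicted price changes.
def best_buys(items, budget):
    """items: [{'name', 'price_now', 'predicted_price'}]; budget in gp."""
    ranked = sorted(
        items,
        key=lambda i: (i["predicted_price"] - i["price_now"]) / i["price_now"],
        reverse=True,
    )
    plan, remaining = [], budget
    for item in ranked:
        quantity = remaining // item["price_now"]
        if quantity > 0 and item["predicted_price"] > item["price_now"]:
            plan.append((item["name"], quantity))
            remaining -= quantity * item["price_now"]
    return plan, budget - remaining  # what to buy, and how much gp gets spent
```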
At the time, it was deployed on an AWS EC2 instance but has since been taken down to make space for other projects. You can see the live project in my final video of the series.
The Future
What I’d do to improve the project was covered incredibly well by a much larger YouTuber (FlippingOldSchool), who referenced my project in a video. Essentially, the prices are heavily dependent on the game's updates, and he said he'd base his bets mainly on those along with a few other minor factors.
Other than that, I’d attempt simpler methods for price prediction, like ARIMA, that have been more traditional for financial models. Finally, I would definitely incorporate Docker to pull all the pieces together.
It was an amazing journey, and although the app is no longer up, I still get requests for help from others taking on this same time series challenge. Again, it feels awesome to inspire talented people to build fun things.