
Shiny-Hunting Pokémon Bot with Arduino and OpenCV (Soft Reset Method)

My sister and I worked on an electronics project during a holiday a few years back. We created a bot that helped us capture a shiny Pokémon (a creature that appears with very low probability) in the video game Pokémon Ultra Moon. Not only was it a fun vacation project, but it also helped my sister gain exposure to programming and electronics concepts.

Here is a video about the experience:

Project: EcuaCines

EcuaCines.com

Features:

  • Displays up-to-date movie descriptions, trailers, and showtimes for all major movie theaters in Quito, Ecuador
  • Allows users to quickly compare movie times instead of loading each theater's web page
  • Works on mobile devices
  • Loads faster than any of the corresponding movie theaters' websites
  • Has at least one Easter egg

Technologies:

  • Zurb Foundation, jQuery, Font Awesome, and Box2DWeb for the front-end
  • PHP with Simple HTML DOM and MySQL for the back-end, with cron jobs and Google Page Speed optimizations
  • Photoshop for the logo design

Discussion:

The website automatically obtains all movie showtime information from each cinema's official website, and then it has to be able to show these showtimes grouped by movie or by cinema. While this may seem like a trivial task, it turned out to be an interesting algorithmic challenge.

Human Steps:

  1. Open each movie theater's website
  2. Recognize that "Superman: El hombre de acero", " EL HOMBRE DE ACERO", and "Hombre de Acero" all refer to the same movie.
  3. Copy and paste the movie title (pick one of the three variations) and description, along with the showtimes for each theater, into EcuaCines's database

Robot Steps:

  1. Open each movie theater's website
  2. Find a piece of text that represents the movie's title by traversing each website's HTML tags according to hard-coded directions.
  3. Recognize that "Superman: El hombre de acero", " EL HOMBRE DE ACERO", and "Hombre de Acero" all refer to the same movie.
  4. Traverse each cinema's website according to hard-coded rules to find the movie summary text (we only need to do this once for each movie).
  5. Traverse each cinema's website according to hard-coded rules to find the movie times. Get each of the times by matching pieces of text with one or two digits followed by a ":" or an "h" and followed by two more digits. Assume a 24-hour time format.
  6. Save the title, description, and showtimes for each movie, as obtained in steps 3, 4, and 5, into EcuaCines's database
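
The time-matching rule from step 5 can be sketched with a regular expression. This is a minimal Python illustration, not the site's actual PHP code; the function name `extract_showtimes`, the validity check on hours and minutes, and the zero-padded output format are my own assumptions.

```python
import re

# One or two digits, then ":" or "h", then exactly two digits (the rule in step 5).
# The lookarounds stop the pattern from matching inside longer runs of digits.
SHOWTIME = re.compile(r"(?<!\d)(\d{1,2})[:hH](\d{2})(?!\d)")


def extract_showtimes(text):
    """Return the showtimes found in text as zero-padded 24-hour "HH:MM" strings."""
    times = []
    for hours, minutes in SHOWTIME.findall(text):
        h, m = int(hours), int(minutes)
        # Assumption: silently drop matches that are not valid clock times.
        if h < 24 and m < 60:
            times.append(f"{h:02d}:{m:02d}")
    return times


print(extract_showtimes("Funciones: 14:30, 17h05 y 9:45"))
```

Normalizing every match to the same "HH:MM" shape makes it easier to compare showtimes across theaters that write them differently.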

How can we teach a robot to ignore the differences between the three strings of characters from step 3, but still differentiate these from other movie titles? This is what I ended up making the program do, and it has worked in practice:

  • Trim whitespace from the start and end of movie titles
  • Remove all accents from letters, e.g. turn all 'á's into 'a's (this accounts for the fact that many people prefer to capitalize "áéíóú" as "AEIOU" instead of "ÁÉÍÓÚ")
  • TURN THE MOVIE TITLES INTO ALL CAPS (this accounts for variations in capitalization)
  • Remove commonplace Spanish words like "EL," "LA," and "Y" (only keep 'important' words)
  • Match each movie title with the title from another theater that has the shortest Levenshtein distance to it, subject to a maximum distance threshold so that no false positive is reported when there actually is no valid match (this accounts for small misspellings and singular-plural variation, e.g. "MONSTERS UNIVERSITY" vs. "MONSTER UNIVERSITY")
  • If the thresholded Levenshtein method finds no match, match together movie titles that share a common substring of 8 or more letters (there's a really cool dynamic programming algorithm for efficiently finding the longest common substring of two strings)
  • If everything else fails, conclude that this movie is unique to this theater
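
The matching steps above can be sketched roughly as follows. This is a Python illustration under stated assumptions rather than the site's PHP implementation: the stopword list beyond "EL", "LA", and "Y", the distance threshold of 3, and all function names are hypothetical, and the common-substring check counts every character (spaces included) for simplicity.

```python
import unicodedata

# Assumption: only "EL", "LA", and "Y" come from the list above; the rest are guesses.
STOPWORDS = {"EL", "LA", "LOS", "LAS", "DE", "Y"}


def normalize(title):
    """Trim, strip accents, upper-case, and drop commonplace words."""
    text = unicodedata.normalize("NFD", title.strip())
    # After NFD, accents are separate combining marks (category "Mn"); drop them.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    return " ".join(w for w in text.upper().split() if w not in STOPWORDS)


def levenshtein(a, b):
    """Edit distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def longest_common_substring_length(a, b):
    """Length of the longest run of characters shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, 1):
            # Extend the run ending at the previous characters, or reset to 0.
            curr.append(prev[j - 1] + 1 if ca == cb else 0)
            best = max(best, curr[j])
        prev = curr
    return best


def find_match(title, candidates, max_distance=3, min_common=8):
    """Return the matching candidate title, or None if the movie looks unique."""
    key = normalize(title)
    keyed = [(normalize(c), c) for c in candidates]
    if not keyed:
        return None
    distance, best = min(((levenshtein(key, k), c) for k, c in keyed),
                         key=lambda pair: pair[0])
    if distance <= max_distance:
        return best
    for k, c in keyed:  # fallback: long shared substring
        if longest_common_substring_length(key, k) >= min_common:
            return c
    return None


print(find_match("Superman: El hombre de acero",
                 ["Monsters University", " EL HOMBRE DE ACERO"]))
```

In this sketch "Superman: El hombre de acero" is too far from " EL HOMBRE DE ACERO" in edit distance, so the pair is only caught by the common-substring fallback, which is the kind of case that step is there for.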

Sprite Sheets

Even assuming constant bandwidth, total image size isn't the only factor that determines how fast a website loads its images. For each image that the page needs to load, the browser makes a separate request to the web server, so loading ten 42KB images individually will be slower than loading the same ten images put together into one larger, ~420KB image. Here's where sprite sheets come into play: by packing several images snugly into one image file and then cutting/masking the compound image back into the original small images on the client side, we make fewer requests to the server, which lowers server load and speeds up page loads.
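
One common way to do the client-side cutting is with CSS background offsets. Here's a small Python sketch that generates such rules for the simplest layout, a horizontal strip of equally sized tiles. The function name `sprite_css`, the file name, and the class names are made up for illustration, and a real sheet may pack images less regularly.

```python
def sprite_css(sheet_url, names, width, height):
    """Build one CSS rule per tile of a horizontal sprite strip."""
    rules = []
    for index, name in enumerate(names):
        # Shift the sheet left so that tile number `index` lands in the visible box.
        offset = -index * width
        rules.append(
            f".{name} {{ width: {width}px; height: {height}px; "
            f"background: url('{sheet_url}') {offset}px 0 no-repeat; }}"
        )
    return "\n".join(rules)


# Hypothetical sheet with two 32x32 tiles side by side.
print(sprite_css("popcorn.png", ["kernel", "popped"], 32, 32))
```

Every generated class points at the same image file, so the browser downloads the sheet once and reuses it for all the tiles.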

Popcorn Spritesheet

I made this sprite sheet today for a project I'm working on (hint: it involves a physics engine).