Captions Made Easy

Brandon Maggiano

@bmaggiano
igcap.dev

After graduating from the University of Arizona's full stack coding bootcamp, I struggled to find opportunities open to new graduates. To expand my skill set and gain experience, I took on an unpaid internship with one of my best friends. It was a great opportunity to finally bring some AI into my projects and to learn a lot about the tech industry.


The Fork

This project started as a fork of the https://github.com/Nutlope/twitterbio repo, which helps users create Twitter bios from a prompt. That evolved into an idea: what if I could take an image and a prompt, and help users create Instagram captions in the styles of popular online personalities? The fork was easy; the rest was a bit more involved. As this was the first AI project I was going to work on, I really had to understand the OpenAI API and how to use it. I also had to learn more about image URLs and how it would even be possible to upload an image and get a response back depicting what was in that image. To my noob, post-grad mind... this seemed impossible. Luckily I had a friend in the field who held my hand through the process.
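To make the "image plus prompt plus style" idea concrete, here is a minimal sketch of how such a request could be assembled. It is not the code behind igcap.dev: the function name, the `CaptionStyle` values, and the system message are all hypothetical, and the model name is left as a parameter. The only thing taken from the OpenAI API is the general message shape, where a vision-capable chat model accepts a user message made of a text part and an `image_url` part. The project itself may well have described the image in a separate step.

```typescript
// Hypothetical names for illustration; none of this comes from the project's source.
type CaptionStyle = "funny" | "professional" | "minimalist";

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatRequest {
  model: string;
  messages: { role: "system" | "user"; content: string | ContentPart[] }[];
}

// Builds the JSON body for a chat completion request. It only constructs
// the payload; sending it (and handling the API key) is left to the caller.
function buildCaptionRequest(
  model: string, // assumption: any vision-capable chat model
  imageUrl: string, // public URL of the uploaded image
  userPrompt: string, // what the user typed about the photo
  style: CaptionStyle
): ChatRequest {
  return {
    model,
    messages: [
      {
        role: "system",
        content: `You write short Instagram captions in a ${style} style.`,
      },
      {
        role: "user",
        content: [
          { type: "text", text: userPrompt },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}
```

Keeping the payload builder separate from the network call makes it easy to check what would be sent before spending any API credits.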


Success

After about a week of building, I was finally able to combine a variety of tech and storage solutions to get this thing working. One of the biggest issues we faced was how to store the image and display it back to the user so they would know it had been uploaded successfully. We had a few ideas:

  • Use local storage to store the image as a base64 string
  • Use a database to store the image and the prompt
  • Use a file system to store the image and the prompt
  • Use an up-and-coming tool called UploadThing to store the image and the prompt

We tried local storage first, but the base64 strings were too long to store there (base64 makes a file roughly a third larger, and browsers only allow a few megabytes of local storage per site). We eventually settled on https://uploadthing.com/. Built to be a better S3, it seemed like a quick, efficient solution. With it, we could upload the image and display it, along with the prompt, back to the user in the UI. The user can then click the "Generate Caption" button and see the caption generated by the AI. This was a huge success, and I'm really proud of the work that I did.
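The local storage problem is easy to see with a little arithmetic. The sketch below estimates how long a base64 data URL would be for an image of a given size and compares it with a storage quota. The helper names are hypothetical, and the quota of about 5 million characters is an assumption: real limits vary by browser and are not always measured the same way, so treat the number as a rough stand-in.

```typescript
// Base64 turns every 3 raw bytes into 4 characters, padding the last group.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// A data URL adds a short prefix such as "data:image/png;base64,".
function dataUrlLength(byteCount: number, mimeType: string): number {
  const prefix = `data:${mimeType};base64,`;
  return prefix.length + base64Length(byteCount);
}

// ASSUMPTION: a quota of roughly 5 MiB worth of characters per site.
// Actual browser limits differ; pass your own value if you know it.
const ASSUMED_QUOTA_CHARS = 5 * 1024 * 1024;

// Returns true if the encoded image alone would fit under the quota.
function fitsInLocalStorage(
  byteCount: number,
  mimeType: string,
  quotaChars: number = ASSUMED_QUOTA_CHARS
): boolean {
  return dataUrlLength(byteCount, mimeType) <= quotaChars;
}

// A 4,000,000-byte photo encodes to more than 5.3 million characters,
// so under the assumed quota it would not fit, while a 1,000,000-byte one would.
console.log(fitsInLocalStorage(4000000, "image/png"));
console.log(fitsInLocalStorage(1000000, "image/png"));
```

Phone photos are often several megabytes, which is why handing the file to a hosted upload service and keeping only its URL is the more practical route.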

Looking forward

As the technology for image recognition and AI in general continues to improve, there is ample opportunity to update these projects and make them even better and more sophisticated. I'm excited to see what the future holds for them and how they can continue to evolve.