AI is moving incredibly fast from concepts in research papers to features in everyday apps. One key driver is the ability to use foundational models via an API. Using hosted models such as OpenAI’s GPT-3, a large language model (LLM), we can add AI to apps without requiring an ML team to train and deploy custom models.
Out of the box, LLMs are best suited for text-related features such as completion, translation, and summarization, making it easier to create AI-powered chatbots (Intercom), copywriting tools (Jasper, Copy.ai), and coding companions (Copilot, TabNine).
However, what if our apps have non-text use cases? For example, in RunwayML’s video editing software, users can simply type “claymation style,” and the clip’s style is automatically changed. This text-to-action interaction pattern allows users to access advanced functions easily, increasing user satisfaction and opening the app to a larger audience.
Say you’ve built a helpdesk app and added text-to-action capability. Now, users can type a command like “Assign all of Pedro’s open tickets to Paula” and be done—no need to run reports and click on individual tickets for 10 minutes.
However, building such features is no simple task. LLMs sometimes make things up (hallucinate). On their own, they can’t make API calls or work with app data. Actions need an option for user confirmation, and users must be able to undo them easily. Research papers like ReAct show us ways to work with LLMs to overcome these limitations. Implementing these techniques requires application developers to do a lot of heavy lifting: tuning prompts, coordinating multiple calls to the LLM, and maintaining memory.
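To make the confirmation and undo requirements concrete, here is a minimal sketch in plain Ruby. It assumes the app turns the user's command into a plain action object that can be previewed, applied, and reversed. Everything in it is hypothetical and for illustration only: the Ticket struct, the ReassignTickets class, and parse_command, which uses a regular expression as a stand-in for the LLM call so the example runs offline.

```ruby
# Hypothetical sketch: none of these names come from a real library.
Ticket = Struct.new(:id, :assignee, :status)

# A proposed action is a plain object, so the app can show it to the user,
# apply it only after confirmation, and reverse it later.
class ReassignTickets
  def initialize(tickets, from:, to:)
    @targets = tickets.select { |t| t.assignee == from && t.status == :open }
    @from = from
    @to = to
  end

  # Human-readable preview for the confirmation prompt.
  def summary
    "Reassign #{@targets.size} open ticket(s) from #{@from} to #{@to}"
  end

  def apply
    @targets.each { |t| t.assignee = @to }
  end

  def undo
    @targets.each { |t| t.assignee = @from }
  end
end

# Stand-in for the LLM: a real app would send the command to a model and
# parse a structured reply. A regex plays that role here.
def parse_command(text)
  match = text.match(/Assign all of (\w+)'s open tickets to (\w+)/i)
  match && { from: match[1], to: match[2] }
end

tickets = [
  Ticket.new(1, "Pedro", :open),
  Ticket.new(2, "Pedro", :closed),
  Ticket.new(3, "Ana", :open)
]

intent = parse_command("Assign all of Pedro's open tickets to Paula")
action = ReassignTickets.new(tickets, **intent)
puts action.summary # prints "Reassign 1 open ticket(s) from Pedro to Paula"
action.apply        # run only after the user confirms
action.undo         # restores the previous assignees
```

Keeping the action separate from the model's output means a hallucinated or misparsed command surfaces in the preview, before any data changes.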
Foundational AI models are becoming a new application platform, and with that comes the need for developer infrastructure to build upon them. You can read more about this on the Madrona Ventures blog (Foundation Models: The future (still) isn’t happening fast enough) and in Astasia Myers’ (Quiet Capital) article (Foundational Model Orchestration (FOMO) — A Primer).
Some of the early projects in this space have been built in Python (LangChain) and Rust (Dust), but there hasn’t been an initiative in Ruby.
This is why we’re working on BoxCars.
BoxCars is middleware designed to simplify the integration of AI functionality into Ruby on Rails applications. It aims to make it easy for Ruby developers to create powerful apps that provide an exceptional user experience. By abstracting away the complexity, BoxCars lets developers incorporate magical AI-powered features into their apps.
Today we are launching the initial release of BoxCars, our open-source gem. You can install it from RubyGems or access the source code on our GitHub repository. See the README in the repository to get started. We are continually developing documentation and anticipate having tutorials and guides available soon.
The key concepts in BoxCars are the engine (the LLM) and boxcars (the tools and interfaces you’d like to access).
Check out this code example demonstrating how BoxCars combines LLM actions with Google search and a calculator. With just these two BoxCars, you can build a wide variety of features!
require "boxcars"

# Run a Train with a calculator and a search boxcar, using the default Engine.
# The engine and the search service need their API keys configured (see the README).
boxcars = [Boxcars::Calculator.new, Boxcars::Serp.new]
train = Boxcars.train.new(boxcars: boxcars)
puts train.run "What is pi times the square root of the average temperature \
in Austin TX in January?"
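To give a feel for the kind of coordination involved when an engine works through several tools, here is a toy version in plain Ruby. It is a sketch under our own assumptions, not the gem's implementation: ToyLookup, ToyCalculator, ScriptedEngine, and run_train are hypothetical names, the lookup data is a placeholder, and the engine replays a fixed script instead of calling an LLM.

```ruby
# Toy illustration only: every name here is hypothetical, not the gem's API.

# A tool has a name the engine can refer to and a run method (text in, text out).
class ToyLookup
  def initialize(facts)
    @facts = facts
  end

  def name
    "lookup"
  end

  def run(input)
    @facts.fetch(input, "unknown")
  end
end

class ToyCalculator
  OPERATORS = %w[+ - * /].freeze

  def name
    "calculator"
  end

  # Handles only "a op b" so the sketch never evaluates arbitrary code.
  def run(input)
    left, operator, right = input.split
    raise ArgumentError, "unsupported: #{input}" unless OPERATORS.include?(operator)

    Float(left).public_send(operator, Float(right)).to_s
  end
end

# Stand-in for the LLM. It replays a fixed script and substitutes the most
# recent observation wherever a step's input contains "{last}".
class ScriptedEngine
  def initialize(steps)
    @steps = steps.dup
  end

  def next_step(_question, history)
    step = @steps.shift
    last = history.last ? history.last[:observation] : ""
    step.merge(input: step[:input].sub("{last}", last))
  end
end

# Ask the engine for the next step, run the chosen tool, remember the result,
# and repeat until the engine signals a final answer.
def run_train(engine, tools, question, max_steps: 5)
  by_name = tools.to_h { |tool| [tool.name, tool] }
  history = [] # the "memory": each step plus what its tool returned
  max_steps.times do
    step = engine.next_step(question, history)
    return step[:input] if step[:tool] == "final"

    observation = by_name.fetch(step[:tool]).run(step[:input])
    history << step.merge(observation: observation)
  end
  raise "no answer after #{max_steps} steps"
end

engine = ScriptedEngine.new([
  { tool: "lookup", input: "open tickets" },
  { tool: "calculator", input: "{last} * 2" },
  { tool: "final", input: "{last}" }
])
tools = [ToyLookup.new({ "open tickets" => "21" }), ToyCalculator.new]
puts run_train(engine, tools, "What is twice the number of open tickets?") # prints 42.0
```

With a real model, the scripted steps would be replaced by parsing the LLM's reply on each turn, which is where prompt tuning and error handling come in.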
BoxCars is a new open-source project and welcomes your help. If you’re passionate about coding and want to be part of a community building the future of AI-powered apps, you should consider joining us on GitHub!
Additionally, we welcome any feedback that you have about BoxCars. Your input is crucial in helping us identify areas that need improvement and shape the future development of the gem. If you have any suggestions or ideas that you would like to share, please let us know!