One of the challenges for developers who want to get into AI is that it's a substantial undertaking: there's a whole new set of tools to learn. It's hard to answer questions like "what is a neural network good for?", "how would I train something on this data set?", and "when do I stop training and start deploying?"
Tools such as ChatGPT are great for programming when you have a specific task and are not familiar with a popular API. It can be difficult to drive ChatGPT to an exact solution when the problem statement is non-trivial, but it’s often quite useful for a first sketch of a feature or an interaction with a library. I’ve used it many times to help me with matplotlib, with React, to write some animation code in JavaScript, and more. It has also been useful for debugging PyTorch tensor issues.
In the future, software development will be more about systems design and less about code minutiae, and that’s probably a good thing for everyone.
As developers, it's great to have these co-pilot assistants for the development process, whether we are working with existing code or generating new code. But what about bringing AI into your applications? What if we want to predict house prices, when a piece of equipment is going to fail, or which customer to call next?
Well, check out this schematic:
There are two main touchpoints between you and Featrix: first, the training data coming in, and second, your application making predictions by calling Featrix. When you use Featrix on your data, Featrix uses deep learning techniques to build a foundational model on your data. Featrix then trains specific task functions that we call neural functions. All of these capabilities are powered by multiple neural networks.
Neural networks are functions: they have an input, they do something, and there is an output. The catch is: we don't know what the function should actually do to the input to produce the output. To characterize the behavior we want, we need a set of example inputs and outputs. When we train the neural network, we ask the computer to figure out the relationships in that data so that it can predict the output for new inputs that were not in our training data.
A neural network is a trainable function.
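To make "trainable function" concrete, here is a minimal sketch in pure Python (no frameworks): a one-parameter "network" f(x) = w * x, trained by gradient descent on made-up example pairs. Real networks have many layers and parameters, but the core idea is the same: adjust the parameters until the function's outputs match the example outputs.

```python
# Training examples: we want the function to learn y = 3 * x.
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

w = 0.0    # the trainable parameter, starting from a guess
lr = 0.01  # learning rate: how big each adjustment step is

for epoch in range(500):
    for x, y in examples:
        pred = w * x                # forward pass: run the function
        grad = 2 * (pred - y) * x   # gradient of squared error w.r.t. w
        w -= lr * grad              # nudge w to reduce the error

# After training, w has converged to ~3.0, so the function now
# predicts outputs for inputs it never saw, e.g. f(10) ~= 30.
print(round(w, 2))
```

Training a deep network follows the same loop, just with millions of parameters and automatic gradient computation (e.g. PyTorch's autograd) instead of a hand-derived gradient.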
As developers, we can do quite a bit. When we use a neural network, we call that prediction or inference. We can predict numerical values, such as when a component in a system might fail. We can predict the category of an object, such as whether a toy is a Lego or not. We can predict which item in a list the user is most likely to click on, or which item the user is least likely to have heard of.
The three categories of prediction are called regression (numerical outputs), classification (categorical outputs), and recommendation (picking items from a set). In all cases, we compute probabilities. The underlying network might be quite complicated (many layers and nodes) or simple (fewer layers and nodes). Indeed, there is a whole craft (part science, part art) to picking a network size and adjusting various settings called hyperparameters.
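To illustrate "in all cases, we compute probabilities": a classification network ends with one raw score (a logit) per category, and the softmax function turns those scores into probabilities that sum to 1. A sketch with made-up scores for the Lego example above:

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw network scores for ["lego", "duplo", "other"]:
probs = softmax([2.0, 1.0, 0.1])

# The highest score gets the highest probability, so the network
# would classify this toy as "lego".
print([round(p, 3) for p in probs])
```

Regression skips this final step and emits the number directly; recommendation systems often rank a set of candidates by scores like these.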
There’s a catch in preparing the data itself: the neural network needs the data represented as numbers. If the data contains strings, whether long text or short labels, we have to encode them. We may also have missing values and have to decide what to do with them–some people drop the records with missing values, while other techniques fill in a synthetic value–and these approaches have significant tradeoffs.
Oftentimes there are “hidden” features in the data. Perhaps we care about the day of the week when we have a bunch of timestamps–maybe we’re trying to predict the flow of retail customers at a grocery store. A different business, such as a restaurant, might have those same timestamps but care more about time of day than day of week.
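Surfacing those hidden features is a small transformation once you know to look for it. A sketch using only the standard library, with made-up timestamps: extract day-of-week for the grocery store and hour-of-day for the restaurant.

```python
from datetime import datetime

timestamps = [
    "2024-03-04 09:15:00",  # a Monday morning
    "2024-03-09 19:30:00",  # a Saturday evening
]

for ts in timestamps:
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    day_of_week = dt.strftime("%A")  # e.g. "Monday" -- the grocery store's feature
    hour_of_day = dt.hour            # e.g. 9 -- the restaurant's feature
    print(day_of_week, hour_of_day)
```

The raw timestamp column stays the same; which derived columns matter depends entirely on the business question being asked.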
That’s one of the key reasons we built Featrix. For nearly every team working on predictive AI, the complexity of preparing data is so high that the team is vastly limited in what they can achieve.
The tradeoffs described above have statistical impact on the questions being asked and the predictions that answer those questions–this means a statistical background is often helpful to understand these tradeoffs. But most developers don’t have a statistical background.
Steve Jobs in the late 90s said that, as a developer, you can probably build a few “floors” of a building before it falls in on you. And so the key to building higher with ease is to start on a higher floor. His argument was that if Apple provided APIs to enable you to build from the 20th floor instead of the second floor, you can skip the steps required to get to the 20th floor and just focus on your application.
We agree. So we aim to give developers a big head start on all the other AI toolkits and runtimes out there. We simplify the problem: you provide inputs and Featrix constructs what we call neural functions. There’s a ton of deep learning and statistical machinery we leverage to build the best possible neural functions on your data–and you don’t have to know anything about any of it. You don’t have to clean or process your data. You just give it to Featrix and let it do its thing.