Trust and AI: How Featrix Brings Trust to AI Results
Where does trust come from?
Trust comes from reliability, transparency, and shared understanding. People trust AI systems that consistently perform as expected, explain their decisions clearly, and align with ethical values. By demonstrating accountability, providing user control, and showcasing real-world success, organizations can foster confidence in their AI solutions.
Trusting the components of an AI system
There are three main components to any AI system: the input data, the model itself, and the outputs of that model and their structure.
Trusting Your Input Data
With most AI systems, there are two main input knobs: one is the data and its encoding, and the second is the model hyperparameters, which require tuning specifically for each AI application.
With Featrix, we do things a bit differently: the two main knobs are your data and how long you let Featrix train on your data. There are no hyperparameters; to adjust the model’s behavior, you tune the data, not magical knobs with unknown outcomes.
Traditional AI systems require cleaning and encoding data. These processes often entail fragile transformations that may break at some point in the future, and you must ensure the same transformations happen at inference time, too. Featrix instead encodes the raw symbols in your data and applies the same encoding at inference time. You don't need to clean your data or fix noisy symbols; you can train Featrix on the raw data.
Trusting Featrix Models
Featrix models have two parts: the embedding space, which converts raw data into a vector embedding, and the predictive model, which we call a neural function.
The embedding space is a deep neural network trained on your data, resulting in a compact and extremely powerful representation of it. The embedding can contextualize conditional relationships in your data, so the models built on top of the embedding are powerful yet quite simple, because the information is contextualized and maximized in the embedding itself.
We have named the models built on the Featrix embedding space neural functions to make them easy for developers to conceptualize: using a Featrix model (i.e., making a function call) invokes the embedding space neural network to encode the data, then the downstream neural network that predicts a target column, whether for classification or regression.
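Conceptually, a neural function chains the two networks. In this sketch, embed() and predict() are toy stand-ins, not Featrix API calls; they only illustrate the two-stage shape of the call.

```python
# Conceptual sketch of a "neural function": encode, then predict.
def embed(record):
    # Stand-in for the embedding-space network: raw record -> vector.
    # (A real embedding is learned; this toy just measures symbol lengths.)
    return [float(len(str(v))) for v in record.values()]

def predict(vector):
    # Stand-in for the downstream predictive network: vector -> target.
    return "good" if sum(vector) > 3 else "bad"

def neural_function(record):
    # One call runs both stages, like calling an ordinary function.
    return predict(embed(record))

print(neural_function({"duration": 24, "purpose": "radio/tv"}))  # -> good
```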
Trusting the Embedding Space
The embedding space is core to everything Featrix does. When we compute embeddings, we actually build two: one three-dimensional embedding for visualization and one higher-dimensional embedding for the actual vectors. We do this to avoid the distortion that PCA or t-SNE can introduce when down-projecting higher-dimensional embeddings to 3d space for visualization.
So when you view the 3d embedding sphere in the Featrix UI, you are not seeing any distortion introduced by visualization techniques, though of course it is not possible to capture all the subtleties of the higher-dimensional representation in the 3d model.
Our Embeddings Explorer lets you see clusters and evaluate the organization of the original data and their proximity in the transformed space. In particular, the goal of the embedding function is to carry the property of proximity from the original space into the embedding space: data that are close together before embedding should be close together after embedding. We provide mouseovers, and you can query the data to show specific values in the Embeddings Explorer, letting you evaluate the embedding space constructed on your data.
Trusting the Predictions
Our primary means of trusting the predictions comes from the large result structure we return on every query, which contains key information about trusting the prediction from the model. In particular, we include:
- An echo of the query that came in, eliminating any confusion when mapping results to inputs by index.
- If the query included additional fields or misspelled field names, those fields will be listed in the ignored_query_columns list; the actual query used by the model is found in actual_query.
- The list of columns the model was trained on appears in available_query_columns, which can highlight mismatches between the model and the inference inputs.
- The results are contained in the results section. For classification, this is the probability distribution of all the possible categories. For regression, this will be a scalar.
- If applicable, query_column_guardrails will be populated with information about unknown symbols used on a categorical input, or if a scalar is vastly out of bounds with the training data (more than 4 standard deviations from the mean).
This additional metadata is present for every query, which makes tracking results robust and clear.
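As a sketch, a payload carrying these fields might look like the following. The field names follow the list above; all concrete values here are invented for illustration.

```python
# Hypothetical prediction payload; field names follow the result structure
# described above, values are made up.
payload = {
    "actual_query": {"checking_status": "no checking", "duration": 24},
    "ignored_query_columns": ["durration"],      # misspelled input field
    "available_query_columns": ["checking_status", "duration", "credit_amount"],
    "results": {"good": 0.82, "bad": 0.18},      # classification: full distribution
    "query_column_guardrails": {},               # empty: no out-of-bounds inputs
}

# A classification result is a probability distribution over the categories,
# so the top class and its probability fall out directly.
best = max(payload["results"], key=payload["results"].get)
print(best, payload["results"][best])  # -> good 0.82
```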
An End-to-End Example
Let's take a look at an example data set called credit-g, which has been thoroughly examined by the AI community since 1994. It is an easy data set to understand: each row is an applicant applying for a loan, and we want to classify the application as yes/no for making the loan (target = "good" or "bad").
Here's how the data schema appears when we load it into Featrix. We can look at the sample values in Featrix and verify that Featrix has picked appropriate types for each column.
Once the data is loaded, we can train a model in just a few clicks. After the model is built, we get access to standard ML metrics.
Now we can access the "prediction sandbox" and try queries on our model right from the browser.
On the left panel, we can experiment with our inputs, and on the right, we see Featrix's JSON payload resulting from the prediction, including all the metadata mentioned above.
Of course, we can access all of this information through the Featrix API. When we call Featrix for predictions, we get all of this information back in our application. This lets us show the user an error or raise an alert if we get errors or hit guardrails we didn't expect to hit.
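That application-side handling can be sketched as follows, assuming a payload shaped like the fields described above (the helper name and sample values are invented):

```python
def check_prediction(payload):
    """Collect warnings from the trust signals in a prediction payload."""
    warnings = []
    if payload.get("ignored_query_columns"):
        warnings.append(
            "ignored input fields: %s" % payload["ignored_query_columns"])
    for col, info in payload.get("query_column_guardrails", {}).items():
        warnings.append("guardrail hit on %r: %s" % (col, info))
    return warnings

# Hypothetical payload exercising both signals.
payload = {
    "ignored_query_columns": ["durration"],   # misspelled field from the caller
    "query_column_guardrails": {
        "credit_amount": "more than 4 standard deviations from the training mean",
    },
}
for w in check_prediction(payload):
    print("WARNING:", w)
```

Because the signals travel with every response, this check can run on every prediction rather than only in offline audits.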
More Trust Issues
There are many more topics to explore with AI and trust, such as the ethical use of data, bias in models, and observability of the system. All of these trust issues can be addressed with Featrix tooling and clear applications of your data and AI models. Let us know if we can help!
Recap: Featrix Enables Trust in AI
This figure shows how Featrix modules work together to achieve the goal of trustworthy AI. Every layer of the Featrix system, from input data to models to predictions, is implemented with trust and safety in mind.
Learn
AI has a lot of jargon. Get our AI Primer poster as a PDF or as a hard copy.
What's next?
How do we recommend you actually get started? If you don't have a clear vision of how predictive analytics applies to your industry, you can get inspiration from our blog series on industries; look for the "Industries" posts on our blog page.
Ready to dive in? Our documentation provides more detailed step-by-step instructions on key steps in the workflow, including uploading data and training a neural predictor. Check out the documentation index for information on additional topics.
How can we help?
Reach out via hello@featrix.ai or schedule time to meet with a developer on our team.