Build and manage AI applications without moving your data into complex pipelines and specialized vector databases. Integrate AI and vector search directly with your database, including real-time inference and model training, using only Python.
from superduperdb import superduper
from superduperdb.mongodb import Collection
import pymongo

my_db = superduper(pymongo.MongoClient().my_db)
r = my_db.execute(
    Collection('docs')
    .like({'txt': 'similar to this'})
    .find_one()
)
Build AI applications on top of your datastore
A single scalable deployment of all your AI models and APIs, kept up-to-date automatically because new data is processed as soon as it arrives.
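The idea of keeping model outputs up to date as data arrives can be pictured with a small, library-free sketch. Everything in it (`TinyStore`, `add_listener`, `word_count`) is a hypothetical stand-in written for illustration; it is not SuperDuperDB's API or implementation, only one way the pattern might look.

```python
from typing import Callable, Dict, List


class TinyStore:
    """Hypothetical in-memory stand-in for a datastore collection."""

    def __init__(self) -> None:
        self.docs: List[Dict] = []
        self.listeners: List[Callable[[Dict], None]] = []

    def add_listener(self, key: str, model: Callable, output_key: str) -> None:
        # Register a hook that writes model(doc[key]) back into the document.
        def hook(doc: Dict) -> None:
            if key in doc:
                doc[output_key] = model(doc[key])

        self.listeners.append(hook)
        # Backfill documents that were inserted before the listener existed.
        for doc in self.docs:
            hook(doc)

    def insert(self, doc: Dict) -> None:
        self.docs.append(doc)
        # Run every registered model on the new document right away.
        for hook in self.listeners:
            hook(doc)


def word_count(text: str) -> int:
    """Toy 'model' (an assumption for this sketch): counts whitespace-separated tokens."""
    return len(text.split())


store = TinyStore()
store.insert({"txt": "hello world"})
store.add_listener(key="txt", model=word_count, output_key="n_words")
store.insert({"txt": "vector search in your database"})
print(store.docs)
```

Both the document inserted before the listener was registered and the one inserted afterwards end up with an `n_words` field, which is the behaviour the description above refers to.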
Turn your existing database into a fully-functional vector database
No need to introduce an additional database and duplicate your data to use vector search and build on top of it. SuperDuperDB enables vector search in your existing database.
from superduperdb import Listener, VectorIndex
from superduperdb.ext.openai import OpenAIEmbedding
from superduperdb.mongodb import Collection

# `db` is a datastore wrapped with superduper(...), like `my_db` in the first example.
db.add(
    VectorIndex(
        identifier='my-index',
        indexing_listener=Listener(
            model=OpenAIEmbedding(model='text-embedding-ada-002'),
            key='txt',
            select=Collection('documents').find(),
        ),
    )
)
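To make the notion of vector search over existing documents concrete, here is a minimal, library-free sketch. The bag-of-words `embed` function, the fixed `vocab`, the `_vec` field, and the `like` helper are all assumptions made for illustration; a real deployment would use a learned embedding model and the index configured above, not this code.

```python
import math
from typing import Dict, List, Sequence


def embed(text: str, vocab: Sequence[str]) -> List[float]:
    """Toy embedding: bag-of-words counts over a fixed vocabulary."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocab]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def like(docs: List[Dict], query: str, vocab: Sequence[str], n: int = 1) -> List[Dict]:
    """Return the n documents whose stored vectors are closest to the query."""
    q = embed(query, vocab)
    ranked = sorted(docs, key=lambda d: cosine(q, d["_vec"]), reverse=True)
    return ranked[:n]


vocab = ["cat", "dog", "mat", "stock", "prices"]
docs = [
    {"txt": "the cat sat on the mat"},
    {"txt": "stock prices fell sharply"},
    {"txt": "a cat and a dog"},
]

# "Indexing": store each document's vector next to the document itself.
for doc in docs:
    doc["_vec"] = embed(doc["txt"], vocab)

top = like(docs, "cat dog", vocab, n=1)
print(top[0]["txt"])
```

The vectors live alongside the documents they describe, so no second copy of the data is needed to answer a similarity query.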
from superduperdb import superduper
from superduperdb.mongodb import Collection
from transformers import pipeline

model = superduper(pipeline('sentiment-analysis'))

# Train on the documents that already have a rating.
model.fit(
    X='review',
    y='rating',
    db=my_db,
    select=Collection('docs').find({'rating': {'$exists': 1}})
)

# Make predictions on unseen data.
model.predict(
    X='review',
    db=my_db,
    select=Collection('docs').find()
)
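As a rough picture of what training "by querying" means, the sketch below selects labelled documents with a filter, fits a deliberately trivial model on two named fields, and predicts for the unlabelled rest. The `find` helper and the `MeanWordRating` class are hypothetical and written only for this illustration; they do not reflect how SuperDuperDB or any Hugging Face pipeline is implemented.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def find(docs: List[Dict], has_field: Optional[str] = None) -> List[Dict]:
    """Toy query: all documents, or only those containing `has_field`."""
    return [d for d in docs if has_field is None or has_field in d]


class MeanWordRating:
    """Toy model: predicts the average rating of the known words in a review."""

    def fit(self, X: str, y: str, select: List[Dict]) -> None:
        sums = defaultdict(float)
        counts = defaultdict(int)
        for doc in select:
            for word in doc[X].lower().split():
                sums[word] += doc[y]
                counts[word] += 1
        self.word_mean = {w: sums[w] / counts[w] for w in sums}
        # Fallback for reviews made up entirely of unseen words.
        self.default = sum(d[y] for d in select) / len(select)

    def predict(self, X: str, select: List[Dict]) -> List[float]:
        outputs = []
        for doc in select:
            known = [
                self.word_mean[w]
                for w in doc[X].lower().split()
                if w in self.word_mean
            ]
            outputs.append(sum(known) / len(known) if known else self.default)
        return outputs


docs = [
    {"review": "great product", "rating": 5},
    {"review": "terrible product", "rating": 1},
    {"review": "great value"},
]

model = MeanWordRating()
model.fit(X="review", y="rating", select=find(docs, has_field="rating"))

unlabelled = [d for d in find(docs) if "rating" not in d]
predictions = model.predict(X="review", select=unlabelled)
print(predictions)
```

The training set is defined by a query rather than by a separate export step, which is the point the `select=` argument makes in the snippet above.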
Work with any ML/AI frameworks and APIs
Integrate and combine models from Sklearn, PyTorch, and HuggingFace with AI APIs such as OpenAI to build even the most complex AI applications and workflows.
What you can do with SuperDuperDB:
Deploy all your AI models to automatically compute outputs (inference) in your datastore in a single environment with simple Python commands.
Train models on your data in your datastore simply by querying without additional ingestion and pre-processing.
Integrate AI APIs (such as OpenAI) to work together with other models on your data effortlessly.
Search your data with vector-search, including model management and serving.
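from superduperdb import superduper
from superduperdb.mongodb import Collection
from sentence_transformers import SentenceTransformer

model = superduper(
    SentenceTransformer('all-MiniLM-L6-v2')
)

# `db` is a datastore wrapped with superduper(...), like `my_db` in the first example.
model.predict(
    X='input_col',
    db=db,
    select=Collection(name='test_documents').find()
)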
Example Use-Cases
Check out AI use cases and applications that we have already implemented using open-source models and public APIs!
SuperDuperDB transforms your datastore into:
An end-to-end AI deployment which includes a model repository & registry as well as computation of outputs
A model trainer allowing you to easily train and fine-tune your models simply by querying your data(store)
A feature store in which the model outputs are stored in desired formats and types, instantly available in the datastore
A fully functional vector database enabling straightforward generation of vector embeddings on your data with your choice of models
Who is SuperDuperDB for?
Full Stack Developers
who want to implement next-generation AI in their applications without needing MLOps knowledge.
Data Scientists
who want to develop and train AI models using their favourite tools, with minimum overhead.
ML Engineers
who want a single scalable setup that supports local, on-prem, and cloud deployment.
Why choose SuperDuperDB?
Avoid data duplication, convoluted pipelines and complex infrastructure with a single scalable deployment of all AI models and APIs
New data is processed automatically and immediately, keeping the deployment up-to-date
A simple and familiar Python interface that can handle even the most complex AI use-cases without requiring MLOps knowledge
Development and deployment of AI applications on your data is massively simplified
A developer experience tailored to AI
Work natively in Python
Apply AI with just a few simple commands, no MLOps experience required
Integration with ML/AI frameworks and APIs
Native support of PyTorch, Sklearn and HuggingFace models as well as models externally hosted behind public APIs.
Complementary to the existing ecosystem
Based around flexible notions of data types, data stores, data retrieval and AI models, SuperDuperDB is super-easy to extend and customize.
from superduperdb.ext.pillow import pil_image
from superduperdb.ext.torch import Tensor

# `db`, `collection`, the image `y`, and the tensor `x` are assumed to be defined elsewhere.
db.execute(collection.insert_one({
    'y': pil_image(y),
    'x': Tensor(x, shape=x.shape)
}))
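One way to picture a "flexible notion of data types" is an encoder that turns a Python value into bytes a datastore can hold and back again. The sketch below is a generic illustration using only the standard library; the `Encoder` class, the `_type` and `_bytes` field names, and the `load` helper are assumptions invented for this example and are not SuperDuperDB's actual storage format.

```python
import struct
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Encoder:
    """Hypothetical custom data type: a name plus encode/decode functions."""

    identifier: str
    encode: Callable[[Any], bytes]
    decode: Callable[[bytes], Any]

    def __call__(self, value: Any) -> Dict[str, Any]:
        # Wrap the value in a storable record tagged with its type.
        return {"_type": self.identifier, "_bytes": self.encode(value)}


# A vector of floats stored as little-endian 8-byte doubles.
float_vector = Encoder(
    identifier="float_vector",
    encode=lambda values: struct.pack(f"<{len(values)}d", *values),
    decode=lambda blob: list(struct.unpack(f"<{len(blob) // 8}d", blob)),
)

REGISTRY = {float_vector.identifier: float_vector}


def load(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Decode every tagged field of a stored document back into Python values."""
    out = {}
    for key, value in doc.items():
        if isinstance(value, dict) and "_type" in value:
            out[key] = REGISTRY[value["_type"]].decode(value["_bytes"])
        else:
            out[key] = value
    return out


stored = {"label": "example", "x": float_vector([0.5, 1.25, -3.0])}
restored = load(stored)
print(restored)
```

Adding support for a new kind of data then amounts to registering another encoder, without changing how documents are inserted or queried.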
from superduperdb.ext.torch import TorchModel
from superduperdb.ext.transformers import TransformersModel
from superduperdb.ext.sklearn import SklearnModel
from superduperdb.ext.openai import OpenAIEmbedding

# `MyTorchModule` and the preprocessors `p1`..`p4` are placeholders defined elsewhere.
m1 = TorchModel('m1', object=MyTorchModule, preprocess=p1)
m2 = TransformersModel('m2', object=MyTorchModule, preprocess=p2)
m3 = SklearnModel('m3', object=MyTorchModule, preprocess=p3)
m4 = OpenAIEmbedding('m4', preprocess=p4)
from superduperdb import superduper
from superduperdb.mongodb import Collection
import pymongo

my_db = superduper(pymongo.MongoClient().my_db)
r = my_db.execute(Collection('docs').find_one())
Get started with SuperDuperDB
SuperDuperDB comes pre-loaded with all you need to supercharge your data with AI. It’s really that simple!