SuperDuperDB: Bring AI to your favorite database!

PYTHON

Apache 2.0

Build and manage AI applications without moving your data into complex pipelines and specialized vector databases. Integrate AI and vector search directly with your database, including real-time inference and model training, using only Python!

Build AI applications on top of your datastore

A single scalable deployment of all your AI models and APIs, kept up to date automatically because new data is processed as soon as it arrives.
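To make the idea of an automatically updated deployment concrete, here is a minimal, self-contained sketch in plain Python. It is a conceptual illustration only, not SuperDuperDB's implementation or API: `WatchedCollection`, `add_listener`, the `_outputs` field and `toy_sentiment` are all hypothetical names invented for this example.

```python
from typing import Callable, Dict, List


class WatchedCollection:
    """Toy in-memory collection that runs registered models on every insert."""

    def __init__(self) -> None:
        self.docs: List[dict] = []
        self.listeners: Dict[str, Callable[[dict], object]] = {}

    def add_listener(self, name: str, fn: Callable[[dict], object]) -> None:
        # Register the model and backfill outputs for documents already stored.
        self.listeners[name] = fn
        for doc in self.docs:
            doc.setdefault("_outputs", {})[name] = fn(doc)

    def insert_one(self, doc: dict) -> dict:
        # Store a copy, then compute every registered model's output for it.
        stored = dict(doc)
        stored["_outputs"] = {name: fn(stored) for name, fn in self.listeners.items()}
        self.docs.append(stored)
        return stored


def toy_sentiment(doc: dict) -> str:
    # Stand-in for a real model: a keyword check, nothing more.
    return "positive" if "great" in doc["review"].lower() else "negative"


collection = WatchedCollection()
collection.insert_one({"review": "Great product"})
collection.add_listener("sentiment", toy_sentiment)     # backfills the first document
collection.insert_one({"review": "Broke after a day"})  # processed on insert
```

After these calls every document carries a `sentiment` entry under `_outputs`, whether it was inserted before or after the model was registered. That is the behavior "kept up to date automatically" describes.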

Turn your existing database into a fully-functional vector database

No need to introduce an additional database and duplicate your data to use vector search and build on top of it. SuperDuperDB enables vector search in your existing database.

from superduperdb import superduper
from superduperdb.mongodb import Collection
from transformers import pipeline

# my_db is assumed to be a database already wrapped with superduper(...)
model = superduper(pipeline('sentiment-analysis'))

# train on documents that already have a rating
model.fit(
    X='review',
    y='rating',
    db=my_db,
    select=Collection('docs').find({'rating': {'$exists': 1}})
)

# make predictions on unseen data
model.predict(
    X='review',
    db=my_db,
    select=Collection('docs').find()
)

Work with any ML/AI frameworks and APIs

Integrate and combine models from Sklearn, PyTorch, HuggingFace with AI APIs such as OpenAI to build even the most complex AI applications and workflows.

What you can do with SuperDuperDB:

Deploy & Compute

Deploy all your AI models in a single environment and automatically compute their outputs (inference) in your datastore with simple Python commands.

Train Models

Train models on the data in your datastore simply by querying it, without additional ingestion or pre-processing.

Integrate APIs

Integrate AI APIs (such as OpenAI) so they work together with other models on your data.

Vector Search

Search your data with vector search, with model management and serving included.

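from superduperdb import superduper
from superduperdb.mongodb import Collection
from sentence_transformers import SentenceTransformer

# db is assumed to be a database already wrapped with superduper(...)
model = superduper(
    SentenceTransformer('all-MiniLM-L6-v2')
)

# compute an embedding for 'input_col' of every document in the collection
model.predict(
    X='input_col',
    db=db,
    select=Collection(name='test_documents').find()
)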
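Once embeddings are stored next to the documents, a vector search is essentially a nearest-neighbor ranking. The sketch below shows that ranking step in plain Python, using cosine similarity over a small in-memory list. It is an illustration under stated assumptions, not SuperDuperDB's search API: the `embedding` field name, the `vector_search` helper and the three-dimensional vectors are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_search(query: Sequence[float], docs: List[dict], n: int = 2) -> List[Tuple[float, str]]:
    """Return the n (score, _id) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query, d["embedding"]), d["_id"]) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


# Toy documents with made-up 3-dimensional embeddings.
documents = [
    {"_id": "a", "embedding": [1.0, 0.0, 0.0]},
    {"_id": "b", "embedding": [0.9, 0.1, 0.0]},
    {"_id": "c", "embedding": [0.0, 0.0, 1.0]},
]
results = vector_search([1.0, 0.0, 0.0], documents, n=2)
```

Here `results` lists documents `a` and `b` ahead of `c`, because their embeddings point in nearly the same direction as the query. A production system would typically replace this linear scan with an index.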

Example Use-Cases

Check out AI use cases and applications that we have already implemented using open-source models and public APIs!

SuperDuperDB transforms your datastore into:

🔮

An end-to-end AI deployment which includes a model repository & registry as well as computation of outputs

🧠

A model trainer allowing you to easily train and fine-tune your models simply by querying your data(store)

🛒

A feature store in which the model outputs are stored in desired formats and types, instantly available in the datastore

📦

A fully functional vector database enabling straightforward generation of vector embeddings on your data with your choice of models

Data Layer: MongoDB (Atlas), S3, PostgreSQL, MySQL, DuckDB, SQLite, BigQuery, Snowflake

Self-hosted Models: LLaMA, Dolly, CLIP, Stable Diffusion, and more + custom.

AI APIs: OpenAI, Cohere AI, and more.

AI Frameworks & Libraries: PyTorch, TensorFlow, Sklearn, HuggingFace, Keras, and more.

ML Tooling: Weights & Biases, MLflow, TensorBoard, and more.

Who is SuperDuperDB for?

Full Stack Developers

who want to implement next-gen AI in their applications without needing MLOps knowledge.

Data Scientists

who want to develop and train AI models using their favorite tools, with minimal overhead.

ML Engineers

who want a single scalable setup that supports local, on-prem, and cloud deployment.

Why choose SuperDuperDB?

⭐️

Avoid data duplication, convoluted pipelines and complex infrastructure with a single scalable deployment of all AI models and APIs

⭐️

New data is processed automatically and immediately, keeping the deployment always up-to-date

⭐️

A simple and familiar Python interface that can handle even the most complex AI use-cases without requiring MLOps knowledge

⭐️

Development and deployment of AI applications on your data are massively simplified

A developer experience tailored to AI

Work natively in Python

Apply AI with just a few simple commands, no MLOps experience required

Integration with ML/AI frameworks and APIs

Native support for PyTorch, Sklearn and HuggingFace models, as well as models hosted externally behind public APIs.

Complementary to the existing ecosystem

Based around flexible notions of data types, data stores, data retrieval and AI models, SuperDuperDB is super-easy to extend and customize.

from superduperdb.ext.pillow import pil_image
from superduperdb.ext.torch import Tensor
  
# x, y, db and collection are assumed to be defined already
db.execute(collection.insert_one({
    'y': pil_image(y),
    'x': Tensor(x, shape=x.shape)
}))
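The idea behind a flexible data type can be sketched without any library: an encoder pairs a function that turns a Python object into bytes with one that turns the bytes back. The example below is a hypothetical illustration in plain Python, not SuperDuperDB's encoder API. `ToyEncoder`, `float_vector` and the `_encoder` and `_bytes` field names are invented for this sketch.

```python
import struct
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class ToyEncoder:
    """Hypothetical helper: pairs an encode and a decode function under a name."""
    name: str
    encode: Callable[[Any], bytes]
    decode: Callable[[bytes], Any]


def encode_floats(values: List[float]) -> bytes:
    # Pack the values as little-endian float64 numbers.
    return struct.pack(f"<{len(values)}d", *values)


def decode_floats(blob: bytes) -> List[float]:
    count = len(blob) // 8  # each float64 takes 8 bytes
    return list(struct.unpack(f"<{count}d", blob))


float_vector = ToyEncoder("float_vector", encode_floats, decode_floats)

# What would be written to the datastore: plain bytes plus the encoder name.
record = {
    "x": {
        "_encoder": float_vector.name,
        "_bytes": float_vector.encode([0.5, 1.5, -2.0]),
    }
}

# Reading the record back restores the original Python object.
restored = float_vector.decode(record["x"]["_bytes"])
```

Storing the encoder name alongside the bytes is one possible design choice: it lets a reader pick the matching decoder later without knowing in advance what type each field holds.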

from superduperdb.ext.torch import TorchModel
from superduperdb.ext.transformers import TransformersModel
from superduperdb.ext.sklearn import SklearnModel
from superduperdb.ext.openai import OpenAIEmbedding
  
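# MyTorchModule, MyTransformersPipeline, MySklearnEstimator and p1..p4
# are placeholders for your own objects and preprocessing functions
m1 = TorchModel('m1', object=MyTorchModule, preprocess=p1)
m2 = TransformersModel('m2', object=MyTransformersPipeline, preprocess=p2)
m3 = SklearnModel('m3', object=MySklearnEstimator, preprocess=p3)
m4 = OpenAIEmbedding('m4', preprocess=p4)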

In [1]: from superduperdb import superduper 
   ...: from superduperdb.mongodb import Collection 
   ...: import pymongo 

In [2]: my_db = superduper(pymongo.MongoClient().my_db) 

In [3]: r = my_db.execute(Collection('docs').find_one())

🚀

Get started with SuperDuperDB

SuperDuperDB comes pre-loaded with all you need to supercharge your data with AI. It’s really that simple!