51 Common Libraries/Packages Needed to Build AI/LLM Apps

Share it with your senior IT friends and colleagues
Reading Time: 3 minutes

We all use common libraries and packages to build AI/LLM applications, but few sources explain exactly what each one is for.

  1. Tiktoken – A fast BPE (Byte Pair Encoding) tokeniser for use with OpenAI’s models.
  2. Bitsandbytes – Makes large language models more accessible through k-bit quantization for PyTorch.
  3. Accelerate – A library that lets the same PyTorch code run across any distributed configuration by adding just four lines of code. In short, it makes training and inference at scale simple, efficient, and adaptable.
  4. Duckduckgo_search – Search for words, documents, images, etc. using the DuckDuckGo.com search engine.
  5. Langchain – Contains the higher-level, use-case-specific LangChain components that are at the core of the application’s architecture.
  6. Langchain_community – Contains all the third-party integrations. These integrations are ready to use in any LangChain application.
  7. Langchain_core – Contains simple, core abstractions that have emerged as a standard, as well as the LangChain Expression Language as a way to compose these components together.
  8. Langchain_openai – Contains the LangChain integrations for OpenAI.
  9. Pypdf – A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files.
  10. Faiss-cpu – Efficient similarity search and clustering of dense vectors (CPU version).
  11. Faiss-gpu – Efficient similarity search and clustering of dense vectors (GPU version).
  12. Streamlit – Lets you transform Python scripts into interactive web apps in minutes.
  13. Google-generativeai – The Google AI Python SDK, the easiest way for Python developers to build with the Gemini API.
  14. Python-dotenv – Reads key-value pairs from a .env file and sets them as environment variables.
  15. Transformers – Provides thousands of pre-trained models to perform tasks on different modalities such as text, vision, and audio.
  16. Pdf2image – A wrapper around the command-line tools that convert a PDF to a list of PIL Images.
  17. Chromadb – Client for Chroma, the open-source embedding database.
  18. Pathlib – A standard-library module that offers a set of classes to handle filesystem paths.
  19. Youtube_transcript_api – Fetches the transcripts/subtitles for a given YouTube video. It also works for automatically generated subtitles and supports translating subtitles.
  20. gTTS – Google Text-to-Speech, a Python library and CLI tool to interface with Google Translate’s text-to-speech API.
  21. Langchain_google_genai – An integration package connecting Google’s Genai package and LangChain.
  22. Gradio – Python library for easily interacting with trained machine learning models.
  23. Google-search-results – Scrape and search localized results from Google, Bing, Baidu, Yahoo, Yandex, eBay, Homedepot, and YouTube at scale using SerpApi.com.
  24. Sentence_transformers – Multilingual text embeddings.
  25. Unstructured – A library that prepares raw documents for downstream ML tasks.
  26. Pytube – Python 3 library for downloading YouTube videos.
  27. Diffusers[torch] – State-of-the-art diffusion in PyTorch and JAX.
  28. Torch – Tensors and dynamic neural networks in Python with strong GPU acceleration.
  29. Torchvision – Image and video datasets and models for Torch deep learning.
  30. Ffmpeg – A Python package for ffmpeg; see https://github.com/jiashaokun/ffmpeg
  31. Pyautogen – A programming framework for agentic AI.
  32. Uvicorn – The lightning-fast ASGI server.
  33. Sse_starlette – SSE (server-sent events) plugin for Starlette.
  34. Langserve – Helps developers deploy LangChain runnables and chains as a REST API.
  35. Fastapi – FastAPI framework: high performance, easy to learn, fast to code, ready for production.
  36. Sagemaker – Open-source library for training and deploying models on Amazon SageMaker.
  37. Lm-scorer – Language-model-based sentence scoring library.
  38. Runpod – Python library for the RunPod API and serverless worker SDK.
  39. Replicate – Python client for Replicate.
  40. Keras-nlp – Industry-strength natural language processing extensions for Keras.
  41. Gradientai – Python client for the Gradient AI API.
  42. Openai – The official Python library for the OpenAI API.
  43. Scikit-learn – A set of Python modules for machine learning and data mining.
  44. TensorFlow – An open-source machine learning framework for everyone.

  45. Tqdm – Instantly makes your loops show a smart progress meter: just wrap any iterable with tqdm(iterable), and you’re done!
  46. Keras – Keras 3 is a multi-backend deep learning framework, with support for JAX, TensorFlow, and PyTorch. It is used to build and train models for computer vision, natural language processing, audio processing, time series forecasting, recommender systems, etc.
  47. Crewai – A framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI lets agents work together on complex tasks.
  48. Crewai_tools – A set of tools for the CrewAI framework.
  49. Giskard[llm] – A testing framework dedicated to ML models, from tabular models to LLMs.
  50. Os – A standard-library module that provides functions for interacting with the operating system.
  51. Sentencepiece – Python wrapper for SentencePiece, an unsupervised text tokenizer for neural-network-based text generation. The API offers encoding, decoding, and training.
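
Two entries in the list, pathlib and os, ship with Python itself, so they are the easiest to try without installing anything. The snippet below is only a minimal sketch of how they tend to show up in LLM apps: pathlib collects the PDF files to be processed and os reads an API key from an environment variable. The helper names `find_pdfs` and `get_api_key`, the `docs` folder, and the `OPENAI_API_KEY` variable name are assumptions made for this illustration, not part of any library.

```python
import os
import tempfile
from pathlib import Path


def find_pdfs(folder):
    """Return the PDF files directly inside `folder`, sorted by path (hypothetical helper)."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def get_api_key(name="OPENAI_API_KEY"):
    """Read an API key from the environment; the variable name is an assumption."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"Environment variable {name} is not set")
    return key


if __name__ == "__main__":
    # Build a throwaway folder so the example runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        docs = Path(tmp) / "docs"
        docs.mkdir()
        (docs / "report.pdf").touch()
        (docs / "notes.txt").touch()
        print([p.name for p in find_pdfs(docs)])

    # Placeholder value for the demo only; never hard-code a real key.
    os.environ.setdefault("OPENAI_API_KEY", "sk-placeholder")
    print(len(get_api_key()) > 0)
```

In a real project the key would more likely sit in a .env file and be loaded into the environment with the python-dotenv package from the list before `get_api_key` is called.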

Which one have you used the most?

Tailored AI + LLM Coaching for Senior IT Professionals

In case you are looking to learn AI + Gen AI in an instructor-led live class environment, check out these dedicated courses for senior IT professionals.

Pricing for AI courses for senior IT professionals – https://www.aimletc.com/ai-ml-etc-course-offerings-pricing/

My name is Nikhilesh. If you have any feedback or suggestions on this article, please feel free to connect with me – https://www.linkedin.com/in/nikhileshtayal/

Nikhilesh Tayal