🔬

Science Video Database

Curated Search Experience for Technical Science Enthusiasts

A curated search experience featuring biology, chemistry, computer science, mathematics, and physics videos from YouTube and other sources, searchable by transcript.

📚 Prior Work & Research Contributions

Overview

The Science Video Database is prior work demonstrating a curated, searchable database of scientific video content. It lays a foundation for integrating video-based learning and research content into the CopernicusAI Knowledge Engine, enabling multi-modal knowledge exploration through searchable transcripts and filtered scientific video content.

🔬 Research Contributions

  • Curated Video Collection: Filtered scientific videos from YouTube and other sources
  • Transcript-Based Search: Searchable video database using transcript content
  • Multi-Disciplinary Coverage: Biology, chemistry, computer science, mathematics, physics
  • Integration Framework: Designed for CopernicusAI Knowledge Engine integration
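
To illustrate what transcript-based search with a discipline filter could look like, here is a minimal TypeScript sketch. The data model (`TranscriptSegment`, its field names, the `Discipline` values) and the `searchTranscripts` helper are assumptions made for this example and are not taken from the project's code. Matching is a plain case-insensitive substring test, which a real search engine would replace.

```typescript
// Hypothetical data model: names and fields are illustrative only.
type Discipline =
  | "biology"
  | "chemistry"
  | "computer-science"
  | "mathematics"
  | "physics";

interface TranscriptSegment {
  videoId: string;
  discipline: Discipline;
  startSec: number; // offset of the segment within the video
  text: string;
}

// Return segments whose text contains every query term (case-insensitive),
// optionally restricted to a single discipline.
function searchTranscripts(
  segments: TranscriptSegment[],
  query: string,
  discipline?: Discipline,
): TranscriptSegment[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  if (terms.length === 0) return [];
  return segments.filter((seg) => {
    if (discipline !== undefined && seg.discipline !== discipline) return false;
    const haystack = seg.text.toLowerCase();
    return terms.every((t) => haystack.includes(t));
  });
}

// Made-up sample data for the example.
const sampleSegments: TranscriptSegment[] = [
  { videoId: "vid-a", discipline: "physics", startSec: 42, text: "Entropy measures the number of microstates." },
  { videoId: "vid-b", discipline: "biology", startSec: 10, text: "Entropy also appears in protein folding." },
  { videoId: "vid-c", discipline: "physics", startSec: 5, text: "Newton's second law relates force and acceleration." },
];

const physicsHits = searchTranscripts(sampleSegments, "entropy microstates", "physics");
console.log(physicsHits.map((s) => `${s.videoId}@${s.startSec}s`));
```

Because each segment carries its start offset, a match can point the viewer to the moment in the video where the terms are spoken rather than to the video as a whole.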

⚙️ Technical Achievements

  • Hybrid Search System: Transcript-based search with filtering capabilities
  • Ingestion Pipeline: Automated video ingestion and transcript processing
  • Vector Database Integration: Support for semantic search using embeddings
  • Scalable Architecture: Designed for scaling from 2k to 200k+ videos
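
One common way to combine keyword matching with embedding-based semantic search is to blend the two scores. The sketch below assumes a simple linear blend controlled by a hypothetical `alpha` weight. The `IndexedSegment` shape, the `keywordScore` heuristic (a stand-in for a real ranking function such as BM25), and the two-dimensional toy embeddings are all illustrative assumptions, not the project's actual implementation.

```typescript
// Sketch only: weights, field names, and the scoring formula are assumptions.
interface IndexedSegment {
  id: string;
  text: string;
  embedding: number[]; // assumed to be produced by some embedding model
}

interface RankedResult {
  id: string;
  score: number;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Fraction of query terms that occur in the text (a simple keyword signal).
function keywordScore(text: string, query: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  const hits = terms.filter((t) => haystack.includes(t)).length;
  return hits / terms.length;
}

// Blend keyword and semantic scores; alpha = 1 is keyword-only, 0 is semantic-only.
function hybridRank(
  segments: IndexedSegment[],
  query: string,
  queryEmbedding: number[],
  alpha = 0.5,
): RankedResult[] {
  return segments
    .map((seg) => ({
      id: seg.id,
      score:
        alpha * keywordScore(seg.text, query) +
        (1 - alpha) * cosineSimilarity(seg.embedding, queryEmbedding),
    }))
    .sort((x, y) => y.score - x.score);
}

// Toy data with 2-dimensional embeddings, for illustration only.
const indexed: IndexedSegment[] = [
  { id: "s1", text: "Gradient descent minimizes a loss function", embedding: [1, 0] },
  { id: "s2", text: "Photosynthesis converts light into chemical energy", embedding: [0, 1] },
];

const ranked = hybridRank(indexed, "gradient descent", [1, 0]);
console.log(ranked); // s1 ranks first with score 1, s2 second with score 0
```

In a deployed system the similarity computation would normally be delegated to the vector database and the keyword score to the search engine; the blend shown here only illustrates how the two signals might be merged into a single ranking.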

🎯 Position Within CopernicusAI Knowledge Engine

The Science Video Database serves as a multi-modal content component of the CopernicusAI Knowledge Engine, providing:

  • Video content integration for learning
  • Transcript search for research discovery
  • Multi-modal learning support
  • Potential linking to research papers
  • Enhancement of AI podcasts

This work is a proof of concept for AI-assisted video content management in scientific research. It demonstrates how searchable transcripts can enable systematic discovery and integration of video-based knowledge.

🎬 Live Demo

Access the live Science Video Database application with searchable transcripts and curated video content.

🔬 Open Science Video Database

🚀 Project Milestones

Prototype (Current)

  • 10-15 channels, ~2k videos
  • Basic ingestion pipeline
  • Transcript storage
  • Hybrid search UI with filters
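
As an illustration of the transcript-processing step in an ingestion pipeline, the following sketch groups timed caption lines into fixed-length chunks suitable for storage and indexing. The `CaptionLine` and `TranscriptChunk` shapes, the `chunkCaptions` function, and the 30-second default window are hypothetical choices for this example; the project's actual pipeline may work differently.

```typescript
// Hypothetical shapes for illustration; not taken from the project's code.
interface CaptionLine {
  startSec: number;
  text: string;
}

interface TranscriptChunk {
  videoId: string;
  startSec: number;
  endSec: number; // start time of the last caption line in the chunk
  text: string;
}

// Group caption lines into chunks that each span less than `windowSec` seconds,
// measured from the start of the first line in the chunk.
function chunkCaptions(
  videoId: string,
  lines: CaptionLine[],
  windowSec = 30,
): TranscriptChunk[] {
  const chunks: TranscriptChunk[] = [];
  let current: TranscriptChunk | null = null;
  // Sort a copy so the caller's array is left untouched.
  const ordered = [...lines].sort((a, b) => a.startSec - b.startSec);
  for (const line of ordered) {
    if (current === null || line.startSec - current.startSec >= windowSec) {
      current = {
        videoId,
        startSec: line.startSec,
        endSec: line.startSec,
        text: line.text.trim(),
      };
      chunks.push(current);
    } else {
      current.endSec = line.startSec;
      current.text += " " + line.text.trim();
    }
  }
  return chunks;
}

// Made-up caption lines for the example.
const captionChunks = chunkCaptions("vid-a", [
  { startSec: 0, text: "Welcome" },
  { startSec: 12, text: "today we cover" },
  { startSec: 31, text: "eigenvalues" },
  { startSec: 45, text: "and eigenvectors" },
]);
console.log(captionChunks);
```

Chunking by time keeps each stored unit small enough to index while preserving a start offset that search results can link back to.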

Alpha (Next Phase)

  • 50+ channels, 20k videos
  • Personalization MVP
  • Email digests
  • Improved UX polish

Scaling (Future)

  • 200k videos
  • Autoscaling workers
  • Admin dashboard
  • Automated QA

🔧 Technical Architecture

Frontend

  • Next.js 14
  • React Server Components
  • Hybrid search UI

Backend

  • Node.js/TypeScript
  • Serverless functions
  • Ingestion worker

Database

  • PostgreSQL
  • Vector DB
  • Search Engine

Cloud

  • Google Cloud Platform
  • Vertex AI
  • Cloud Run

🔗 Related Projects

🔬 CopernicusAI

Main knowledge engine that can integrate video content with AI podcasts and research synthesis.

Visit CopernicusAI →

📚 Research Paper Metadata Database

Potential integration for linking videos to research papers and metadata.

Visit Metadata Database →

🧬 GLMP

Biological process visualization that could link to related educational videos.

Explore GLMP →

🛠️ Programming Framework

Process analysis tool that could utilize video content for process explanations.

Explore Framework →

How to Cite This Work

Welz, G. (2024–2025). Science Video Database.
Hugging Face Spaces. https://huggingface.co/spaces/garywelz/sciencevideodb

This project serves as infrastructure for AI-assisted science: it enables systematic, transcript-based search and integration of video-based knowledge, and provides multi-modal content discovery within the CopernicusAI Knowledge Engine.