Help test, benchmark, document, and improve Source’s AI tutors, agent harnesses, RAG corpora, model workflows, dashboards, local inference labs, custom Pi harnesses, and proprietary AI systems.
The AI Systems Researcher lane is the applied research and benchmarking lane inside Source Research Lab. Its job is to evaluate whether Source’s AI systems are useful, reliable, grounded, measurable, auditable, and scalable.
Rather than accepting marketing claims of LLM capabilities, we stress-test models, prompts, corpora, and pipelines to establish empirical thresholds.
Researchers validate the components that form the foundation of Source’s training and execution stack:
Testing focuses on metrics, diagnostics, error logs, and verification runbooks rather than simple chat prompts.
Do not confuse this with enrollment. The training program is about learning how to use the systems. The research lane is about verifying whether the systems themselves function as intended.
Source University asks: Can this person learn the system?
Source Research Lab asks: Can this person help prove whether the system works?
"The training program is about becoming capable inside the system. The research lane is about testing whether the system itself is capable."
We require capable operators with technical discipline. While a PhD is not required, applicants must show structured reasoning and technical comfort under uncertainty.
Selected researchers are assigned specific testing scopes inside our sandboxes. These tasks focus on auditing pipeline performance, measuring cost routing, and identifying anomalies.
Validate tutor explanation consistency, feedback loops, and capability progression metrics.
Stress-test custom Pi harnesses, Codex task runs, and Claude Code execution pipelines.
Audit citation precision, semantic search chunking configs, and vector recall anomalies.
Evaluate specialized industry corpora used for local business diagnostics and reports.
Test approval queues, validation scripts, and human-in-the-loop safety frameworks.
Audit whether dashboards communicate operational trace logs, drawdowns, and latency paths.
Evaluate reply classification agents, campaign suppression rules, and delivery setups.
Test multi-agent build convergence, layer locking, and repository branch tracking.
Compare local vLLM performance with frontier APIs to balance routing cost and latency.
Audit runbooks, decision logs, and mistake tracking directories for ingestion safety.
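To make the scope item on comparing local vLLM performance with frontier APIs more concrete, here is a minimal sketch of a cost-aware routing check. It is an illustration under stated assumptions, not Source's routing logic: the `Backend` type, the `pick_backend` helper, the backend names, and every price, latency, and quality number are hypothetical placeholders that a real benchmark run would replace.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Backend:
    """One candidate serving option. All fields are hypothetical benchmark results."""
    name: str
    cost_per_1k_tokens: float  # USD, placeholder value
    p95_latency_ms: float      # placeholder latency from a benchmark run
    quality_score: float       # 0..1, placeholder score from an evaluation set


def pick_backend(
    backends: List[Backend], min_quality: float, max_latency_ms: float
) -> Optional[Backend]:
    """Return the cheapest backend that meets both thresholds, or None if none does."""
    eligible = [
        b for b in backends
        if b.quality_score >= min_quality and b.p95_latency_ms <= max_latency_ms
    ]
    return min(eligible, key=lambda b: b.cost_per_1k_tokens, default=None)


# Placeholder measurements, invented for illustration only.
candidates = [
    Backend("local-vllm", 0.002, 900.0, 0.78),
    Backend("frontier-api", 0.030, 1400.0, 0.91),
]

easy = pick_backend(candidates, min_quality=0.75, max_latency_ms=1500.0)
hard = pick_backend(candidates, min_quality=0.90, max_latency_ms=1500.0)
unmet = pick_backend(candidates, min_quality=0.95, max_latency_ms=1500.0)

print(easy.name)   # local-vllm
print(hard.name)   # frontier-api
print(unmet)       # None
```

The sketch treats quality and latency as hard thresholds and cost as the tiebreaker. A weighted score is another reasonable design, but thresholds keep the routing decision easy to audit.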
The AI Systems Researcher lane is structured around ten active research programs. Each program is designed to turn vague AI capability claims into measurable research evidence.
We provide controlled, qualification-gated access to custom harnesses, private databases, and serving networks. Access is staged and tied directly to project workloads.
Source is interested in neuron-level hallucination mitigation research, including hallucination-associated neuron detection, activation steering, local model intervention experiments, groundedness evaluation, and reliability benchmarking.
Safety Boundary: We make no claim that hallucinations are solved, that we possess non-hallucinating models, or that perfect factuality is guaranteed.
The AI Systems Researcher lane is strictly output-driven. We measure value by the quality and reproducibility of the scorecards, datasets, and memos submitted.
Reports measuring whether AI tutors improve learning, assignment quality, proof-of-work output, and operator readiness.
Reports measuring task completion, validation pass rate, restartability, auditability, model behavior, and tool-use reliability.
Reports measuring retrieval precision, recall, source grounding, citation faithfulness, chunking quality, and hallucination behavior.
Reports identifying unsupported claims, fabricated citations, missing sources, overconfidence, and grounding failures.
Side-by-side comparisons of frontier models, local models, coding agents, judge models, embedding models, and rerankers.
Reports identifying which models should handle which tasks, with cost, latency, quality, and risk considerations.
Reports measuring latency, throughput, quality, cost, privacy, local-vs-frontier performance, and deployment tradeoffs.
Reports reviewing whether dashboards expose useful state, alerts, queues, failures, readiness, and next actions.
Reports identifying failure modes, validation gaps, review burden, false approvals, and human-in-the-loop improvements.
Reports reviewing multi-agent build runs, branch variants, layer locks, critique loops, telemetry, and convergence patterns.
Reports evaluating canonical artifacts, SOPs, handoff packets, mistake logs, decision logs, and retrieval-ready documentation.
Public or private artifacts demonstrating real research contribution, where mutually approved and appropriate.
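As an example of what the retrieval precision and recall reports listed above might compute, here is a minimal sketch of precision@k and recall@k for a single query. The function name `precision_recall_at_k` and the document IDs are hypothetical. The sketch assumes retrieved IDs are unique and that relevance labels come from a hand-annotated evaluation set.

```python
from typing import List, Sequence, Tuple


def precision_recall_at_k(
    retrieved: Sequence[str], relevant: Sequence[str], k: int
) -> Tuple[float, float]:
    """Compute precision@k and recall@k for one query.

    precision@k = relevant hits in the top k / k
    recall@k    = relevant hits in the top k / total relevant documents
    Assumes the IDs in `retrieved` are unique.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant_set = set(relevant)
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant_set)
    precision = hits / k  # divides by k even if fewer than k results came back
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical ranked results and gold labels for one query.
retrieved_ids: List[str] = ["d3", "d7", "d1", "d9", "d4"]
relevant_ids: List[str] = ["d1", "d3", "d8"]

p, r = precision_recall_at_k(retrieved_ids, relevant_ids, k=4)
print(f"precision@4={p:.2f} recall@4={r:.2f}")  # precision@4=0.50 recall@4=0.67
```

Averaging these per-query values over an evaluation set gives the corpus-level figures a scorecard would report.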
Every formal research memo submitted to the lab is structured to include:
Not every researcher receives access to every system. Source Research Lab access is qualification-based, project-scoped, permissioned, staged, revocable, and confidentiality-bound.
Candidate reviews research lanes, program summaries, and boundaries.
Candidate submits background, proof-of-work, research interests, and availability.
Source evaluates technical capability, logical clarity, and confidentiality readiness.
Candidate is matched to specific, scoped AI Systems programs.
Researcher receives scoped access keys for target compute tools.
Researcher submits structured memos, code runs, and scorecards.
Strong candidates are evaluated for paid collaboration or deeper tasks.
Selected researchers may receive controlled, qualification-based access to Source research environments, proprietary configurations, custom harnesses, private corpora, model workflows, and dashboards depending on project fit, approval, licensing, confidentiality, and availability.
Certain configurations, internal benchmarks, and experimental systems are held back from public visibility and may only be discussed with qualified candidates after review. Access to compute, terminals, paid APIs, local inference setups, or frontier models is not guaranteed.
We operate a serious applied research space. Review these non-guarantees before you apply.
This is not a prompt bootcamp or slide presentation track. Value is measured strictly by active test sandbox outputs.
We do not provide free API tokens or server access for personal research projects. Resources are locked to matched scopes.
Vetting models does not guarantee employment, paid fellowships, or contract engagements with Source.
This page is not offering unrestricted access to models, servers, tools, corpora, or proprietary Source systems. Access is tied strictly to qualification, trust, project fit, confidentiality, availability, resource cost, clear scope, and useful research output.
We vet candidates systematically. Review these targeted questions to prepare your research log submissions.
Review clarifications on testing scopes, requirements, and bench routes.
If you are serious about helping test, benchmark, document, and improve advanced AI learning systems, agent harnesses, RAG corpora, model workflows, dashboards, local inference labs, custom Pi harnesses, and proprietary Source research environments, you may apply for AI Systems Research review.