AI & Data Privacy

Q: Will AI models be trained on my code?

No. We exclusively use enterprise-tier AI services that contractually prohibit using customer data for model training. Your code is not retained beyond the processing window, and it is never used to improve or fine-tune AI models.

Q: Does my code get sent to third-party services?

For Standard and Enhanced tiers: sanitized code is processed by enterprise-tier AI services. We strip secrets, credentials, PII, and sensitive configuration before any external processing. The specific services are disclosed in your DPA. For Air-Gapped tier clients, no code leaves your environment — all AI processing happens on local, self-hosted models.

Q: Who owns the code you produce?

You do. All code and deliverables are assigned to you upon delivery. Our IP Assignment agreement ensures full ownership transfer. We retain no rights to use, reuse, or redistribute your custom code.

Q: Can you work with HIPAA-regulated data?

Yes. Our Enhanced tier includes HIPAA Business Associate Agreements (BAAs), zero data retention with AI providers, and comprehensive audit logging. Our Air-Gapped tier goes further with fully self-hosted AI, so no data leaves your infrastructure. We have deep healthcare domain expertise through our CodeFHIR heritage.

Our Approach to AI & Data Privacy

CodeAIHub uses AI-assisted development tools as an internal accelerator to deliver software faster. AI is part of our engine — not the product. You receive finished, production-ready software backed by expert human oversight.

We believe trust starts with transparency. This page explains how AI fits into our workflow, how your data is protected at every stage, and what legal safeguards are in place — so you can make informed decisions about working with us.

How AI Fits Into Our Process

We use enterprise-tier AI code analysis and generation services to accelerate development. These are proprietary internal tools and workflows — the specific platforms and models we use are part of our competitive methodology, much like any firm's internal toolchain.

Here's what you need to know:

What AI Does

Accelerates code generation from specifications
Assists with code analysis, review, and assessment
Generates test cases and documentation
Identifies patterns, bugs, and potential issues
Speeds up repetitive development tasks

What AI Does NOT Do

Replace human expert review and decision-making
Train on your proprietary code or data
Retain your data beyond the processing window
Make architectural decisions without human oversight
Access your raw code without sanitization first

How Your Data Is Protected

Data protection is a shared responsibility. We operate a dual-layer sanitization model: you prepare your codebase before handoff, and we run our own systematic scan before any AI processing. The second pass exists to catch anything the first one missed.

Client Preparation (Your Responsibility)

Before handing off your codebase, we ask that you:

Remove or rotate any production credentials, API keys, and secrets
Replace real test or seed data with synthetic data — especially PII or PHI
Provide environment-specific configuration separately, not in source files
Flag any files or directories containing particularly sensitive information

You know your codebase best. This first pass removes the most obvious sensitive data at the source.
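
As an illustration of keeping environment-specific configuration out of source files, here is a minimal Python sketch that reads settings from environment variables at startup. The variable names (`APP_DB_URL`, `APP_API_KEY`) and the `load_config` helper are hypothetical and are not part of any CodeAIHub tooling; your project may use a secrets manager or a different mechanism entirely.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    db_url: str
    api_key: str


def load_config(env=None) -> AppConfig:
    """Build configuration from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    required = ("APP_DB_URL", "APP_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail fast rather than falling back to a value hard-coded in source.
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return AppConfig(db_url=env["APP_DB_URL"], api_key=env["APP_API_KEY"])
```

With this pattern the repository you hand off contains only the names of the settings, never their production values.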

CodeAIHub Sanitization Scan (Our Responsibility)

Regardless of client preparation, we run our own systematic scan on every codebase. Our pipeline detects and strips secrets, credentials, API keys, PII, connection strings, and sensitive configuration that may have been missed. AI services only see sanitized code structure and logic.
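
To make the idea of a sanitization scan concrete, the following Python sketch redacts a few kinds of sensitive values with regular expressions. It is an illustration under stated assumptions, not our actual pipeline: the pattern list, the `sanitize` function, and the `<REDACTED:...>` placeholder format are all hypothetical, and a real scanner needs far broader coverage than four patterns.

```python
import re

# Illustrative patterns only (assumptions for this sketch). The named group
# "secret" marks the part of each match that gets replaced.
PATTERNS = [
    ("aws_access_key_id", re.compile(r"(?P<secret>\bAKIA[0-9A-Z]{16}\b)")),
    ("url_credentials", re.compile(r"://(?P<secret>[^\s:/@]+:[^\s/@]+)@")),
    ("assigned_secret", re.compile(
        r"(?i)(?:api[_-]?key|secret|password|token)\s*[:=]\s*[\"'](?P<secret>[^\"']+)[\"']")),
    ("email", re.compile(r"(?P<secret>[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})")),
]


def _redactor(label):
    """Return a replacement function that swaps only the 'secret' group."""
    def replace(match):
        whole, offset = match.group(0), match.start()
        start, end = match.span("secret")
        return whole[:start - offset] + f"<REDACTED:{label}>" + whole[end - offset:]
    return replace


def sanitize(text):
    """Return (sanitized_text, findings), where findings maps label -> count."""
    findings = {}
    for label, pattern in PATTERNS:
        text, count = pattern.subn(_redactor(label), text)
        if count:
            findings[label] = count
    return text, findings
```

Patterns run in list order, so credentials embedded in a connection string are redacted before the email pattern can partially match them. The surrounding code structure (variable names, hostnames, ports) is left intact.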

Secure Development Environment

Sanitized code is stored in a secure, access-controlled environment. Only authorized team members can access it.

AI-Assisted Processing

Clean code is processed by enterprise-tier AI services that operate under strict terms: no data retention beyond the processing window, no use of your data for model training, and encrypted transmission.

Expert Human Review

All AI-generated output is reviewed by our human experts before it becomes part of any deliverable. Nothing ships without human approval.

Shared Responsibility Model

You provide:

Best-effort cleanup of known secrets, real data, and production credentials before handoff.

We guarantee:

Systematic scan and scrub of the entire codebase. We are the last line of defense — we own the final responsibility.

Third-Party AI Services

In the interest of transparency: our Standard and Enhanced tiers use cloud-based, enterprise-tier AI code analysis and generation services as sub-processors. This means sanitized portions of your code are transmitted to these services for processing.

What we commit to regarding third-party AI services:

We only use services that contractually prohibit training on customer data
All data is transmitted over encrypted connections (TLS)
We require that AI providers retain no data beyond the processing window
Your code is sanitized before transmission — raw source code with secrets is never sent
Third-party AI services are listed as sub-processors in our DPA
For clients requiring zero external processing, our Air-Gapped tier uses only self-hosted, local AI models — nothing leaves your environment
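
As a rough illustration of the encrypted-transmission commitment above, this Python sketch builds a client-side TLS context using the standard `ssl` module. The `strict_tls_context` helper and the TLS 1.2 minimum are assumptions made for the example; this page does not specify a minimum protocol version.

```python
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Client context that verifies certificates and rejects TLS below 1.2."""
    # create_default_context() turns on certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

A context like this can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=strict_tls_context())`.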

The specific AI platforms and models we use are part of our proprietary methodology and are not disclosed publicly. However, full sub-processor details are provided in the Data Processing Agreement (DPA) signed as part of every client engagement.

Tiered Security Model

Choose the level of protection that matches your requirements

Standard

For most projects — strong protection with industry-standard practices.

Enterprise-tier AI services (no training on your data)
Code sanitization before any AI processing
NDA and DPA with sub-processor disclosure
Encrypted transmission (TLS)
Secure development environment
IP assignment — you own all deliverables

Enhanced (Recommended for Healthcare)

For regulated industries and sensitive data — additional safeguards and audit trails.

Everything in Standard, plus:
Enterprise AI agreements with zero data retention
Comprehensive audit logs
HIPAA Business Associate Agreement (BAA)
Regular compliance reporting
Dedicated processing infrastructure

Air-Gapped (Maximum Security)

For high-security clients — no data leaves your environment. Ever.

Everything in Enhanced, plus:
Self-hosted, local AI models only
Zero external API calls
On-premises or client-controlled infrastructure
Full data sovereignty
Custom security controls and audit integration
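
The "everything in the previous tier, plus" structure above can be modeled as a simple chain. This Python sketch is only an illustration: the `Tier` class is hypothetical, and the feature labels are abbreviated from the lists above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    adds: tuple                    # features this tier adds on top of its base
    base: Optional["Tier"] = None  # the tier it builds on, if any

    def features(self) -> list:
        """All features, inherited ones first."""
        inherited = self.base.features() if self.base else []
        return inherited + list(self.adds)


STANDARD = Tier("Standard", (
    "Enterprise-tier AI services (no training)",
    "Code sanitization before AI processing",
    "NDA and DPA with sub-processor disclosure",
    "Encrypted transmission (TLS)",
    "Secure development environment",
    "IP assignment",
))
ENHANCED = Tier("Enhanced", (
    "Zero-retention enterprise AI agreements",
    "Comprehensive audit logs",
    "HIPAA BAA",
    "Regular compliance reporting",
    "Dedicated processing infrastructure",
), base=STANDARD)
AIR_GAPPED = Tier("Air-Gapped", (
    "Self-hosted, local AI models only",
    "Zero external API calls",
    "On-premises or client-controlled infrastructure",
    "Full data sovereignty",
    "Custom security controls and audit integration",
), base=ENHANCED)
```

Calling `AIR_GAPPED.features()` returns the Standard items, then the Enhanced items, then the Air-Gapped additions, which mirrors how the tiers are described on this page.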

Legal Protections & Data Handling

Agreements & Contracts

Every engagement is backed by comprehensive legal agreements:

NDA — Non-Disclosure Agreement protecting your confidential information
DPA — Data Processing Agreement defining exactly how your data is handled, including sub-processor disclosure
IP Assignment — All code and deliverables are assigned to you. You own everything we produce for you.
Sub-processor Disclosure — Our DPA identifies the categories of third-party services involved in processing

Data Retention

We do not retain your source code or data beyond the active engagement. Upon project completion, all working copies are securely deleted per our data retention policy. You receive all final deliverables, source code, and documentation. A certificate of data destruction is available upon request.

Compliance Commitments

HIPAA Ready — Business Associate Agreements available for Enhanced and Air-Gapped tiers; deep healthcare expertise from our CodeFHIR heritage
SOC 2 Type II — On our compliance roadmap for formal certification
Cyber Liability Insurance — Professional coverage for added client protection

Frequently Asked Questions

Q: What happens to my code and data after the project ends?

All working copies of your code and data are securely deleted upon project completion, per our data retention policy. You receive all final deliverables, source code, and documentation. A certificate of data destruction is available upon request.

Q: Why don't you publicly disclose which AI platforms you use?

The specific AI platforms and models we use are part of our proprietary methodology, similar to how any technology firm's internal toolchain is considered a trade advantage. What we do disclose: the categories of third-party services involved, their data handling commitments (no training, no retention), and full sub-processor details in the DPA signed as part of every engagement. If you need to know specific providers for compliance or procurement review, we share that information under NDA.

Q: Can we audit your security practices?

Absolutely. For Enhanced and Air-Gapped tier clients, we provide audit logs and compliance reports, and we are open to third-party security assessments. Transparency and accountability are core values, and we welcome scrutiny of our practices.

Questions About Security?

We're happy to discuss your specific data privacy and compliance requirements in detail.