    KVKK-Compliant Artificial Intelligence Guide: A 5-Layer Practical Architecture

    A five-layer architectural guide for KVKK-compliant enterprise AI systems: data residency, explicit consent, anonymisation, cross-border API risk and audit trail. An engineering note from eCloud Tech.

    Published: May 25, 2026 · 11 min read
    Tags: kvkk, artificial-intelligence, ai-governance, aigency

    The most common blocker for AI projects is not the model's technical quality; it is KVKK compliance that cannot be added after the fact. A prototype runs in two weeks, but the KVKK refactor required to take that prototype into enterprise production usually takes three months. What our AI governance service team has observed over the last 18 months: the earlier you include KVKK in the project, the lower the total cost. Day-one KVKK architecture costs 3-5× less than retrofitted architecture.

    This article presents the five-layer practical architecture for KVKK-compliant enterprise AI deployment. Legal framework, technical implementation and audit readiness in one piece. We share seven practices our engineering team has learned while building the AIGENCY V4 platform to be KVKK-compliant and while applying that knowledge across customer projects.

    1. Data residency — the technical meaning of KVKK Article 9

    KVKK Article 9 defines two paths for cross-border transfer of personal data: either the data subject's explicit consent, or transfer to a country on the Board's approved list. As of February 2026, the US is not on that list; EU countries fall under the "adequate protection" framework, and most cloud providers outside Türkiye (AWS US, Azure US, GCP US) require an Article 9 assessment for every query.

    The practical meaning: any user query sent to the ChatGPT API that contains personal data can fall within the scope of a KVKK violation unless the data subject has given explicit consent. Three ways to address this:

    Path 1: Don't send personal data at all. Strip person-identifying fields (name, surname, phone, email, ID, IP) from the query; send only anonymous context to the AI system. This approach requires a preprocessing layer (PII redaction) and is open to leakage under misconfiguration.

    Path 2: Obtain explicit consent from every user. Show a "this data may be sent to a foreign AI service" approval before the query and keep the consent record in an auditable log. This creates user friction for enterprise customers.

    Path 3: Use a Türkiye-hosted alternative. Platforms hosted in Türkiye and designed for day-one KVKK compliance — such as AIGENCY V4 — eliminate the Article 9 problem directly. As an architectural decision, this is the most sustainable approach.
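
    The preprocessing layer mentioned under Path 1 can be sketched as a regex pass that replaces identifiers with placeholders before a query leaves the system. This is a minimal illustration only: the pattern set, the placeholder labels and the `redact` function are assumptions, not the implementation used in any product named here, and regexes alone will miss names and free-text identifiers.

```python
import re

# Hypothetical patterns for illustration. Order matters: phone numbers are
# handled before the 11-digit national ID shape to avoid partial overlaps.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "PHONE": re.compile(r"(?:\+90|0)\s?5\d{2}\s?\d{3}\s?\d{2}\s?\d{2}"),
    # Matches the 11-digit shape of a Turkish national ID; no checksum validation.
    "NATIONAL_ID": re.compile(r"\b[1-9]\d{10}\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(redact("Contact ali@example.com or 0532 123 45 67, ID 12345678901"))
```

    Names and surnames are not caught by this sketch; in practice they usually need a named-entity model on top of the regex pass, which is where the misconfiguration risk noted above tends to appear.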

    Regardless of which path you choose, a documented rationale for the architectural decision must exist. If your answer to "why did you choose this path?" is not written down by the time of a KVKK audit, the audit stalls.

    2. Explicit consent and legitimate interest: the processing conditions of KVKK Article 5

    KVKK Article 5 sets six fundamental processing conditions; the two most often used in AI projects are explicit consent (Art. 5/1-a) and legitimate interest (Art. 5/2-f). The choice between them is a legal decision rather than a technical one, but it carries significant architectural consequences.

    Explicit consent is a specific, informed and freely given approval. The hard part in an AI context is ensuring the user understands what they are consenting to. Broad wording such as "do you consent to your data being processed in our AI system?" is treated as invalid by KVKK Board opinions. Correct consent text: "I consent to processing of my health data (blood tests, imaging results) only for diagnostic-support purposes, only in the AI module operated by eCloud Tech, within the borders of Türkiye." This wording names all four required elements: purpose, data type, processor and location.
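
    One way to keep the four elements enforceable is to store them as separate fields rather than as one block of text. The sketch below is an assumption about how such a record could look; `ConsentRecord` and its field names are hypothetical, and the check only verifies that each element is present, not that its wording is specific enough legally.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ConsentRecord:
    user_id: str
    purpose: str                 # e.g. "diagnostic support"
    data_types: tuple[str, ...]  # e.g. ("blood tests", "imaging results")
    processor: str               # who operates the AI module
    location: str                # where processing takes place
    text_version: str            # version of the consent text shown to the user
    granted_at: datetime

    def missing_elements(self) -> list[str]:
        """Return the names of the four required elements that are empty."""
        required = {
            "purpose": self.purpose,
            "data_types": self.data_types,
            "processor": self.processor,
            "location": self.location,
        }
        return [name for name, value in required.items() if not value]

    def is_specific(self) -> bool:
        return not self.missing_elements()


record = ConsentRecord(
    "u1", "diagnostic support", ("blood tests",), "eCloud Tech", "Türkiye",
    "v3", datetime(2026, 1, 5, tzinfo=timezone.utc),
)
print(record.is_specific())
```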

    Legitimate interest rests on a balancing test: the data controller's or a third party's interest must outweigh the data subject's fundamental rights and freedoms. In an AI context it fits low-risk uses such as customer-support chatbots, content personalisation and fraud detection. The three-step balancing test (LIA, Legitimate Interest Assessment) must be performed in writing:

    1. Is there a legitimate interest? Business purpose, concrete benefit defined.
    2. Necessity test: Can the same outcome be achieved with less data?
    3. Balancing: Does the user experience an unexpected use; is the right to object actionable?

    Practical rule: where special-category personal data (health, biometric, ethnicity, religious belief, etc.) is involved, explicit consent is mandatory under Article 6; legitimate interest does not suffice in this category. For standard personal data, legitimate interest is sufficient and operationally workable in most B2B AI projects.

    3. Anonymisation — irreversible transformation

    KVKK Article 7 covers the right to erasure; but erasure is technically hard in AI systems because data may be embedded in model parameters. The architectural solution to this problem is anonymisation.

    Anonymisation is an irreversible data transformation. Pseudonymisation (e.g. hashing the national ID) still yields personal data under KVKK Article 28, because whoever holds the hash key can re-identify the individual. True anonymisation is achieved by combining three techniques:

    K-anonymity: Every record must be indistinguishable from at least k-1 other records on its quasi-identifiers. Example: a birth year + postcode + sex combination must occur in at least 5 records (k=5). A combination that points to a single individual is generalised into broader buckets (year → decade, postcode → city).

    Differential privacy: Controlled noise is added to the dataset or to query results, so that no individual record can be singled out while aggregate statistics remain approximately accurate. Apple and Google have used this for years; it is also applied to AI training sets.

    Synthetic data generation: A dataset that resembles real data but belongs to no real individual. The safest approach for AI training; statistically aligned with the original but carries no erasure/objection risk.
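
    The k-anonymity rule described above (birth year + postcode + sex, k=5) can be checked mechanically. The following is a small sketch under stated assumptions: the record layout, the `generalise` and `violating_groups` helpers are hypothetical, and the postcode is reduced to its first two digits as a stand-in for "postcode → city".

```python
from collections import Counter


def generalise(record: dict) -> tuple:
    """Coarsen the quasi-identifiers: year -> decade, postcode -> 2-digit prefix."""
    decade = record["birth_year"] // 10 * 10
    city_prefix = record["postcode"][:2]  # assumption: prefix identifies the city
    return (decade, city_prefix, record["sex"])


def violating_groups(records: list[dict], k: int = 5) -> dict:
    """Return the generalised combinations that occur fewer than k times."""
    counts = Counter(generalise(r) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}


records = [{"birth_year": 1980 + i, "postcode": "34100", "sex": "F"} for i in range(5)]
records.append({"birth_year": 1991, "postcode": "06100", "sex": "M"})
print(violating_groups(records, k=5))
```

    Any combination returned by `violating_groups` needs further generalisation or suppression before the dataset can be treated as k-anonymous.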

    Our practice: in the AIGENCY V4 platform, the fine-tuning pipeline applies all three techniques together; an automated "personal data risk assessment" runs before training, and data that fails is excluded. This architectural decision protects us during later model-update or erasure requests.
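
    To make the differential-privacy idea from this section concrete, here is a sketch of the Laplace mechanism applied to a counting query. Note the swap: it perturbs a query result rather than the dataset itself, which is the simplest form to show. The function names and the `epsilon` value are illustrative assumptions, not parameters taken from the pipeline described above.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(values, predicate, epsilon: float, rng=None) -> float:
    """Noisy count; a counting query has sensitivity 1, so scale = 1 / epsilon."""
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


ages = list(range(100))
print(private_count(ages, lambda age: age >= 50, epsilon=1.0))
```

    A smaller `epsilon` means more noise and stronger protection for any single record; the aggregate stays useful because the noise averages out over many records or queries.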

    4. RAG architecture — combining the right to erasure with AI capability

    One of the most elegant solutions in AI architecture for this layer is the RAG (Retrieval Augmented Generation) approach. Instead of embedding user data into model parameters, RAG retrieves it at query time.

    The logic: the model is trained on general knowledge (Turkish language, reasoning, common concepts); enterprise data is held separately in a vector database. When a query arrives, the system first retrieves relevant documents from the vector database, then prompts the model to "produce an answer in the context of these documents".

    The KVKK advantage of RAG is clear: when an erasure request comes, you do not need to retrain the model. You delete the relevant record from the vector database; subsequent queries will not retrieve it. This architecture aligns technically with KVKK Article 7.
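
    The erasure path can be illustrated with a toy in-memory store. This is a sketch, not the API of pgvector, Qdrant or Weaviate; `Chunk`, `VectorStore` and `erase_subject` are hypothetical names, and a real deployment would issue the equivalent delete-by-metadata call against its own database.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    chunk_id: str
    data_subject_id: str  # KVKK metadata: whose personal data this chunk holds
    text: str
    embedding: list[float]


@dataclass
class VectorStore:
    chunks: dict[str, Chunk] = field(default_factory=dict)

    def add(self, chunk: Chunk) -> None:
        self.chunks[chunk.chunk_id] = chunk

    def erase_subject(self, data_subject_id: str) -> list[str]:
        """Delete every chunk of one data subject; return the removed IDs as evidence."""
        removed = [cid for cid, c in self.chunks.items()
                   if c.data_subject_id == data_subject_id]
        for cid in removed:
            del self.chunks[cid]
        return removed


store = VectorStore()
store.add(Chunk("c1", "subject-1", "lab result", [0.1, 0.9]))
store.add(Chunk("c2", "subject-2", "invoice", [0.8, 0.2]))
print(store.erase_subject("subject-1"))
```

    Returning the removed IDs gives the erasure request a record that can be written to the audit trail.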

    In our RAG systems engineering service the standard structure:

    1. Vector database layer (pgvector / Qdrant / Weaviate): enterprise documents are embedded and stored here. Each document is tagged with source reference and KVKK metadata (data subject, processing purpose, retention period).
    2. Retrieval layer: the user's query is embedded; the N semantically nearest documents are pulled. The authorisation layer applies here — documents the user is not authorised to access are filtered out.
    3. Generation layer: the model is given the retrieved documents + query and produces an answer in that context.
    4. Citation layer: the answer comes with references to source documents — the user can verify, the auditor can audit.
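
    The four layers above can be sketched end to end without a real model or database. Everything here is illustrative: embeddings are hand-written vectors, `Doc`, `retrieve` and `build_prompt` are hypothetical names, and the generation layer is reduced to building the prompt that would be sent to a model.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    embedding: tuple[float, ...]
    allowed_roles: frozenset[str]


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, docs, role: str, n: int = 2) -> list[Doc]:
    """Retrieval layer: drop unauthorised documents first, then rank by similarity."""
    permitted = [d for d in docs if role in d.allowed_roles]
    ranked = sorted(permitted, key=lambda d: cosine(query_embedding, d.embedding),
                    reverse=True)
    return ranked[:n]


def build_prompt(question: str, retrieved: list[Doc]):
    """Generation + citation layers: the prompt for the model and the source IDs."""
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in retrieved)
    prompt = f"Answer only from these documents:\n{context}\n\nQuestion: {question}"
    return prompt, [d.doc_id for d in retrieved]


docs = [
    Doc("d1", "Salary bands 2026", (1.0, 0.0), frozenset({"hr"})),
    Doc("d2", "Leave policy", (0.9, 0.1), frozenset({"hr", "support"})),
    Doc("d3", "Refund procedure", (0.0, 1.0), frozenset({"support"})),
]
prompt, citations = build_prompt("How much leave do I get?",
                                 retrieve((1.0, 0.0), docs, role="support"))
print(citations)
```

    Filtering before ranking means an unauthorised document never reaches the prompt, even when it is the closest semantic match.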

    This architecture is both technically stronger (lower hallucination risk, verifiable answers) and KVKK-sustainable. 70% of our current customer projects use this architecture.

    5. Authorisation and audit logs — the technical counterpart of KVKK Article 12

    KVKK Article 12 defines "data security measures"; the technical counterpart in AI systems is role-based access control and immutable audit logs.

    Role-based access: Which role can access which data, under which conditions? This is especially critical in AI systems because a single user query can trigger multiple data sources. Example: a doctor can access the medical history of their own patients but not that of another doctor's patients. The retrieval layer of the AI system must check the user's role before running the query; otherwise the RAG system can inadvertently expose unauthorised data.
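
    The doctor example can be expressed as a deny-by-default check that runs before retrieval. The dictionary fields (`role`, `user_id`, `treating_doctor_id`) and both function names are assumptions made for this sketch.

```python
def can_access(user: dict, record: dict) -> bool:
    """A doctor may read only records of patients they treat; everyone else is denied."""
    if user["role"] == "doctor":
        return record["treating_doctor_id"] == user["user_id"]
    return False  # deny by default for roles without an explicit rule


def authorised_records(user: dict, records: list[dict]) -> list[dict]:
    """Filter applied by the retrieval layer before any similarity search."""
    return [r for r in records if can_access(user, r)]


doctor = {"user_id": "dr-1", "role": "doctor"}
records = [
    {"record_id": "r1", "treating_doctor_id": "dr-1"},
    {"record_id": "r2", "treating_doctor_id": "dr-2"},
]
print([r["record_id"] for r in authorised_records(doctor, records)])
```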

    Audit log: For every query the following fields are automatically captured:

    • Timestamp (microsecond precision)
    • User identity + role
    • Data source accessed + specific record IDs
    • Query text (post PII redaction)
    • Answer summary
    • Model called + version
    • Outcome of the authorisation check

    This log is kept append-only (entries cannot be overwritten or deleted), stored encrypted, and remains accessible for at least the previous 12 months. A KVKK Board audit request must be answerable with a dump within 24-48 hours.
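
    One common way to make an append-only log tamper-evident is a hash chain, where each entry stores the hash of the previous one. The source does not say which mechanism is used, so treat this as one possible sketch; the `AuditLog` class and its field names are hypothetical, and encryption at rest is left out.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditLog:
    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []

    def append(self, **fields) -> dict:
        """Add one entry linked to the previous entry's hash."""
        prev_hash = self._entries[-1]["entry_hash"] if self._entries else self.GENESIS
        body = {
            "timestamp": datetime.now(timezone.utc).isoformat(timespec="microseconds"),
            "prev_hash": prev_hash,
            **fields,
        }
        body["entry_hash"] = _digest(body)
        self._entries.append(body)
        return body

    def verify(self) -> bool:
        """Recompute every hash; any edited or removed entry breaks the chain."""
        prev = self.GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if entry["prev_hash"] != prev or entry["entry_hash"] != _digest(body):
                return False
            prev = entry["entry_hash"]
        return True


log = AuditLog()
log.append(user_id="u1", role="doctor", record_ids=["r1"],
           query_text="[NATIONAL_ID] lab results", answer_summary="summary",
           model="model-x", model_version="1.0", authorised=True)
print(log.verify())
```

    The chain detects tampering but does not prevent it; preventing it still depends on storage-level controls such as write-once media or restricted database permissions.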

    An important practical note: the audit log itself contains personal data (identity of the querier). Log retention cannot be indefinite; typically after 24 months the user identity is anonymised, leaving only operational statistics. This architectural decision aligns with KVKK Article 4/2-d (limited retention).

    6. Cross-border LLM API integration — risk transfer in practice

    Many organisations would rather use a foreign LLM API than build their own AI platform from scratch; the cost and speed advantages are concrete. But the KVKK risks of this choice produce expensive outcomes if unmanaged.

    A risk-mitigation layer is mandatory:

    Risk | Architectural control
    --- | ---
    Personal data leaking to the foreign API | Automatic PII redaction before the query (regex + ML-based detection)
    Missing explicit consent | A separate consent module for AI queries in the user record; fallback to a Türkiye-hosted model where consent is absent
    Provider storing the data | "Do not log" + "do not train" clauses mandatory in the API contract; OpenAI Enterprise / Anthropic Business offer these options
    Data interception in transit | TLS 1.3 mandatory + certificate pinning
    Provider breach | 72-hour notification readiness to the KVKK Board after the incident (Art. 12/5)
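
    For the transit control, Python's standard `ssl` module can refuse anything below TLS 1.3, and a pin can be checked against the SHA-256 fingerprint of the server certificate. The two helper names below are assumptions for this sketch; the DER bytes would typically come from `getpeercert(binary_form=True)` on the connected socket.

```python
import hashlib
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Client context that verifies certificates and rejects TLS versions below 1.3."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


def certificate_matches_pin(der_cert: bytes, pinned_sha256_hex: str) -> bool:
    """Compare the certificate's SHA-256 fingerprint with the pinned value."""
    return hashlib.sha256(der_cert).hexdigest() == pinned_sha256_hex.lower()


ctx = strict_tls_context()
print(ctx.minimum_version)
```

    Pinning needs an update procedure for provider certificate rotation; without one, a routine renewal turns into an outage.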

    A practical case: in early 2025 we consulted a law firm that produced petition drafts with GPT-4. Our assessment: client data flowed to OpenAI without explicit consent, "do not train" was not in the contract, audit log was missing. After a three-week refactor the structure became: PII redaction layer + client consent + fallback to AIGENCY V4 (Türkiye-hosted) + evidence chain for every query. After the refactor the firm passed a KVKK audit smoothly.

    Our recommendation: in projects involving personal data, the primary AI platform should be Türkiye-hosted; use foreign APIs only with anonymous context or with an additional consent layer. This is a balanced approach, not "all or nothing".
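
    The recommendation reads naturally as a routing rule. The sketch below assumes two inputs that upstream components would have to supply (whether the redacted query still contains personal data, and whether an AI-specific consent is on record); the function and backend names are hypothetical.

```python
def choose_backend(contains_personal_data: bool, has_ai_consent: bool) -> str:
    """Route a query: foreign API only for anonymous context or consented data."""
    if not contains_personal_data:
        return "foreign_api"               # anonymous context only
    if has_ai_consent:
        return "foreign_api_with_consent"  # consent is logged alongside the query
    return "turkiye_hosted"                # default for personal data without consent


print(choose_backend(contains_personal_data=True, has_ai_consent=False))
```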

    7. Audit readiness — the proactive approach

    A KVKK audit does not begin on the day it arrives; you must be prepared before it arrives. In our customer projects the standard audit-readiness package contains five components:

    Component 1: Data flow diagram (DPIA — Data Protection Impact Assessment). Where personal data enters the system, where it is processed, where it goes, how long it is kept; with a legal basis label on every arrow. Not a standard architecture diagram; prepared in KVKK format.

    Component 2: Explicit consent inventory. How many of each consent type exist, when they were collected, against which version. Whether old consents become invalid when the consent text changes — the answer must be in writing.

    Component 3: Audit log dump template. When the auditor arrives, in how many minutes can you produce "all AI queries by this user in the last 6 months"? Anything over 24 hours is a risk.

    Component 4: Data-breach response plan. To meet the 72-hour notification window required by Art. 12/5, the internal process must be written; who gets notified, who decides, who writes the notification text must be defined.

    Component 5: Continuously up-to-date registry (VERBİS — Data Controllers Registry). If the organisation has a VERBİS entry, the addition/update of the AI system must be recorded. An outdated VERBİS invites additional audit questions.
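
    The question posed in Component 3 ("all AI queries by this user in the last 6 months") is worth rehearsing as an actual query. This sketch assumes log entries are dictionaries with `user_id` and a timezone-aware `timestamp`; the `user_queries` helper and the 183-day window are illustrative choices.

```python
from datetime import datetime, timedelta, timezone


def user_queries(entries: list[dict], user_id: str, now: datetime,
                 days: int = 183) -> list[dict]:
    """Return one user's log entries from the last `days` days, oldest first."""
    cutoff = now - timedelta(days=days)
    matches = (e for e in entries
               if e["user_id"] == user_id and e["timestamp"] >= cutoff)
    return sorted(matches, key=lambda e: e["timestamp"])


now = datetime(2026, 5, 25, tzinfo=timezone.utc)
entries = [
    {"user_id": "u1", "timestamp": now - timedelta(days=10), "query_text": "a"},
    {"user_id": "u1", "timestamp": now - timedelta(days=400), "query_text": "old"},
    {"user_id": "u2", "timestamp": now - timedelta(days=5), "query_text": "b"},
]
print([e["query_text"] for e in user_queries(entries, "u1", now)])
```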

    An organisation that embeds these five components in the architecture from the start treats the audit as a routine day. An organisation that adds them later treats the audit as a crisis.

    In our practice these five components are part of every AI project's delivery package; not a separate line item. A customer doesn't order "be ready whenever the audit comes" — it is the default behaviour. The concrete benefit of this approach: of the six KVKK audits we managed over the past three years, all six closed without additional document or clarification requests. The audit averaged four working days; the sector average for comparable size is two to three weeks.

    Decision matrix: the right approach for your organisation

    Which architecture is right for you? Three questions for a practical evaluation:

    Question | Answer → recommendation
    --- | ---
    Does the AI system process special-category personal data (health, biometric, ethnicity etc.)? | Yes → Türkiye-hosted (AIGENCY V4 class) is mandatory; explicit consent required
    Would embedding user data into model parameters create a problem (erasure/objection rights)? | Yes → RAG architecture mandatory; retrieval over fine-tuning
    Is using a foreign LLM API an operational must? | Yes → PII redaction + contractual clauses + consent layer mandatory

    For an organisation answering "yes" to all three, a full KVKK-compliant architecture is unavoidable; with two "yes" answers, a hybrid approach is possible; with one, a standard control set suffices.
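
    The matrix and the tiering rule can be written down as a small function, which is handy when the same evaluation is repeated per system. The function name and return shape are assumptions; the zero-"yes" case is not covered by the text above, so the sketch reports it as such rather than guessing.

```python
def recommend(special_category: bool, embedding_is_problem: bool,
              foreign_api_required: bool):
    """Map the three yes/no answers to a tier and the controls each 'yes' triggers."""
    controls = []
    if special_category:
        controls.append("Türkiye-hosted platform + explicit consent")
    if embedding_is_problem:
        controls.append("RAG architecture (retrieval over fine-tuning)")
    if foreign_api_required:
        controls.append("PII redaction + contractual clauses + consent layer")

    tiers = {
        3: "full KVKK-compliant architecture",
        2: "hybrid approach",
        1: "standard control set",
    }
    tier = tiers.get(len(controls), "not covered by the matrix")
    return tier, controls


print(recommend(True, False, True))
```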

    Our enterprise AI platform deployment service includes the five layers and audit readiness as a standard package — not a bolt-on, but day-one architecture. Typical project length: 8-14 weeks; complex multi-system integration can extend to 20.

    Next step

    A KVKK-compliant AI deployment is not a "can it be done / can't it" question; it is a "within which boundaries can it be done" question. When the five layers presented here (data residency, explicit consent, anonymisation, RAG architecture, authorisation + audit) are evaluated at design time, the cost impact usually stays in the 5-10% range; added later, the same compliance requires 30-50% additional development time.

    If you are planning an AI deployment in your organisation, we suggest filling in the five layers with your own answers before a one-hour technical call with our team. Recognising which layer has a gap accelerates the next steps considerably. We accept preliminary calls via our contact page; in the call we work through an adapted evaluation of the five layers for your scenario.

    The next posts in this series will be announced on our blog — planned titles include "Turkish-language LLM fine-tuning and data anonymisation" and "The impact of the EU AI Act × KVKK intersection on Turkish organisations". If a topic is a priority for you, mentioning it in your enquiry lets us share the relevant technical material.

    Frequently Asked Questions

    Does using a foreign LLM API such as ChatGPT automatically violate KVKK?

    Not automatically, but it carries serious risk. KVKK Article 9 requires either explicit consent or a country on the Board's approved list for cross-border data transfer. The US is not on that list; data sent to OpenAI, Anthropic or Google is evaluated under Article 9. Solutions: (a) send only anonymous queries with no personal data, (b) obtain explicit consent from the user and log it auditably, (c) use a Türkiye-hosted alternative (such as AIGENCY V4). The third is the cleanest; the first two impose continuous audit overhead.
