A legal risk map for AI training data has to be built from scratch.
Data sources and risk
- Web scraping: robots.txt + ToS + KVKK art.5.
- User data: legitimate interest + disclosure.
- Third party data: license agreement.
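The source-to-check mapping above can be kept as a small lookup table so that open items are visible per data source. The following is only a minimal sketch: the key names and the `missing_checks` helper are hypothetical labels chosen for illustration, and passing every check here is a bookkeeping aid, not a legal clearance.

```python
from __future__ import annotations

# Required checks per data source, taken from the list above.
# The key names are illustrative assumptions, not legal terms of art.
REQUIRED_CHECKS: dict[str, set[str]] = {
    "web_scraping": {"robots_txt", "terms_of_service", "kvkk_art_5"},
    "user_data": {"legitimate_interest", "disclosure"},
    "third_party_data": {"license_agreement"},
}


def missing_checks(source_type: str, completed: set[str]) -> set[str]:
    """Return the checks that are still open for the given data source."""
    if source_type not in REQUIRED_CHECKS:
        raise ValueError(f"unknown source type: {source_type!r}")
    return REQUIRED_CHECKS[source_type] - completed


# Example: only robots.txt has been reviewed for a scraped source.
print(sorted(missing_checks("web_scraping", {"robots_txt"})))
# prints ['kvkk_art_5', 'terms_of_service']
```

Keeping the table as plain data makes it easy to extend when a new source type or check is added to the map.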
Output liability
- Copyright infringement (reproducing material learned from a protected work).
- Defamatory or insulting outputs.
- Hallucination: general liability for damages under the TBK.
Contract clauses
- Use limitations.
- Indemnifying and indemnified parties.
Frequently asked questions
Can I scrape a public website?
If robots.txt allows it, scraping is technically fine; a legal basis under KVKK is a separate requirement.
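The technical half of that answer can be checked with Python's standard `urllib.robotparser`. This is a minimal sketch: the `robots_allows` helper, the sample rules, the `my-crawler` user agent, and the example.com URLs are assumptions made for illustration. A `True` result covers only the robots.txt gate; ToS and the KVKK legal basis still have to be assessed separately.

```python
from urllib.robotparser import RobotFileParser


def robots_allows(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against already-downloaded robots.txt content."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content used only for this example.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(robots_allows(SAMPLE_ROBOTS, "my-crawler", "https://example.com/blog/post"))
print(robots_allows(SAMPLE_ROBOTS, "my-crawler", "https://example.com/private/data.html"))
```

For a live site, `RobotFileParser.set_url()` followed by `read()` fetches the file instead of parsing a string.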
Is user data opt-out sufficient?
Generally no; explicit consent plus disclosure is required.
What if the output infringes copyright?
Liability runs along the provider-user chain; a contractual indemnity clause is critical.
Relevant legislation
- Turkish Commercial Code (TTK) art. 331-644: formation of joint-stock (A.Ş.) and limited liability (Ltd.) companies, share transfers.
- FSEK art. 2/1-1: protection of software as a work.
- SMK No. 6769: trademarks, patents, utility models, designs.
- KVKK art. 12: data security, by-design principle.
- TBK art. 193 et seq.: contracts, guarantees, indemnification.