Proposal
EventLogging data for the SpecialMuteSubmit schema is only kept for 90 days. We propose to sanitize the user-level information (useragent, ip, geocoded_data), keep the action event logs (defined in https://meta.wikimedia.org/wiki/Schema:SpecialMuteSubmit), and pipe the result to the event_sanitized database.
ip | useragent | uuid | seqid | dt | wiki | webhost | schema | revision | topic | recvfrom | event | geocoded_data | year | month | day | hour
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
sanitized | sanitized | keep | keep | keep | keep | keep | keep | keep | sanitized | keep | keep | sanitized | keep | keep | keep | keep
Description of the kept columns:
uuid: Unique event identifier
seqid: Udp2log sequence ID
dt: Datetime of the event
wiki: Wiki project
webhost: Request host
schema: Title of event schema
revision: Revision ID of event schema
recvfrom: Hostname of server emitting the log line
event: The encapsulated event object
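The keep/sanitize policy in the table can be sketched in a few lines of Python. This is only an illustration, not the actual EventLogging sanitization job: the names COLUMN_POLICY and sanitize_row are hypothetical, and the sketch assumes that a sanitized column is set to null (None) rather than removed, and that any column not listed in the table is dropped.

```python
from typing import Any, Dict

# Per-column policy copied from the table above.
# COLUMN_POLICY is a hypothetical name used only for this sketch.
COLUMN_POLICY: Dict[str, str] = {
    "ip": "sanitized",
    "useragent": "sanitized",
    "uuid": "keep",
    "seqid": "keep",
    "dt": "keep",
    "wiki": "keep",
    "webhost": "keep",
    "schema": "keep",
    "revision": "keep",
    "topic": "sanitized",
    "recvfrom": "keep",
    "event": "keep",
    "geocoded_data": "sanitized",
    "year": "keep",
    "month": "keep",
    "day": "keep",
    "hour": "keep",
}


def sanitize_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a sanitized copy of one event row.

    Assumptions of this sketch: sanitized columns become None,
    kept columns are copied unchanged, and columns that the
    policy does not mention are dropped.
    """
    return {
        column: (row.get(column) if action == "keep" else None)
        for column, action in COLUMN_POLICY.items()
    }
```

For example, passing a row with "ip" set to "203.0.113.7" and "wiki" set to "enwiki" returns a row where "ip" is None and "wiki" is still "enwiki". The input row itself is not modified.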
Schema documentation:
https://meta.wikimedia.org/wiki/Schema:EventCapsule
https://meta.wikimedia.org/wiki/Schema:SpecialMuteSubmit
Stages:
- Legal approval for data retention
- Add schemas to EventLogging whitelist