User talk:Danil Satria

From Wikidata

Question about QuickStatements

Hello Mas Danil,

I just saw you contributing with QuickStatements. I'd like to ask: to enter that much data with QuickStatements, how long did it take you to prepare the CSV? Empat Tilda (talk) 14:04, 8 August 2022 (UTC)

The slow part is matching the data to its QIDs while making sure there are no duplicates, redirects, and so on (check/recheck/cross-check). To give an idea: the 91,281 records for administrative code of Indonesia (P2588) took almost 2 months to clean up (starting from PDF tables), while the upload itself took only about 2 days.
Once the data is ready, to turn it into QuickStatements commands (v1 syntax) in LibreOffice Calc/Excel, prepare three columns:
1. Subject → Q#### (QID)
2. Predicate → P#### (claim/statement)
3. Object → Q####/URL/date/etc. (value)
The three columns can be copied and pasted straight into the QuickStatements form.
Or, if you use OpenRefine, just use the export to QuickStatements menu.
Hope this helps :-) --Danil Satria (talk) 15:53, 8 August 2022 (UTC)
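The three-column layout described above can also be produced outside a spreadsheet. The following is only a minimal sketch, assuming the rows are already matched to QIDs; `Q1234` and `Q5678` are placeholder IDs, and the function name is made up for illustration.

```python
def to_quickstatements_v1(rows):
    """Turn (subject, predicate, object) rows into TAB-separated v1 commands."""
    return "\n".join("\t".join(row) for row in rows)


# Placeholder rows: Q1234 and Q5678 are not real target items.
rows = [
    ("Q1234", "P2588", '"32.13.12.2017"'),  # string values go in double quotes
    ("Q1234", "P131", "Q5678"),             # item values are bare QIDs
]

batch = to_quickstatements_v1(rows)
print(batch)  # paste the printed lines into the QuickStatements form
```

Each printed line is one command, with the three columns separated by a TAB character.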

Merge suggestions on Swedish Wikipedia

For example, I assume you mean Jalancagak (Q952946) and Jalancagak (Q25163684) are the same. The reason Swedish (and Cebuano) has two is that somehow it was entered in GeoNames twice. I think we should not merge; instead one should be deleted (less work, but it must be done by an admin). If we do that, is one GeoNames entry better than the other, or should we assume the GeoNames entry with more interwiki links than just ceb and sv is the one to keep? Also, how does Jalancagak (Q14632402) fit in? Same as GeoNames 1642893 Jalancagak. Maundwiki (talk) 23:26, 6 September 2022 (UTC)

> For example, I assume you mean Jalancagak (Q952946) and Jalancagak (Q25163684) are the same. The reason Swedish (and Cebuano) has two is that somehow it was entered in GeoNames twice.
For administrative subdivisions (and rivers), at least within Indonesia, GeoNames tends to be messy... causing the same problem in OpenStreetMap (so many duplicates and coordinate inaccuracies), which is also why I had to take a break there and focus on Wikidata for a while.
> I think we should not merge; instead one should be deleted (less work, but it must be done by an admin).
I'm just a new editor on Wikidata (and have been inactive on Wikipedia for a very long time), so I'm not sure of all the available options. IMHO, each GeoNames value should be retained to make it easier to find existing duplicates and prevent others from appearing.
In Wikidata it might be possible to replace single-value constraint (Q19474404) with single-best-value constraint (Q52060874) on GeoNames ID (P1566), while on Wikipedia I'm more in favor of just deleting one of them. But I will leave it to the admins of each project; after all, this process only covers the third level of administrative subdivision for now... there will be many more once we process the fourth level.
> If we do that, is one GeoNames entry better than the other, or should we assume the GeoNames entry with more interwiki links than just ceb and sv is the one to keep?
Taking Jalancagak as an example, the GeoNames coordinates in Q952946 are in farmland and the GeoNames coordinates in Q25163684 are on a mountainside. I can't decide which one is more correct; the most representative coordinate in both district items is the first coordinate in Q952946, which uses the location of the district office.
> Also, how does Jalancagak (Q14632402) fit in?
It is a village within a district of the same name. The P2588 value of Jalancagak (Q14632402), 32.13.12.2017, can be read as:
  • 32 → West Java Province
  • 13 → Subang Regency
  • 12 → Jalancagak District
  • 2 → it is a village (1 = kelurahan/urban village, 3 = desa adat/traditional village, 4 = island)
  • 017 → Jalancagak [village]
We had better use administrative code of Indonesia (P2588) for administrative subdivisions of Indonesia; it is based on the official catalog released by the Indonesian Ministry of Home Affairs every 1 or 2 years, the latest being Indonesian Minister of Home Affairs Decree Number 050-145 of 2022 (Q112136164).
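The breakdown of 32.13.12.2017 above can be sketched in code. This is only an illustration of the reading given in this thread, assuming a four-segment code whose last segment is one type digit followed by a three-digit serial; the function name, dictionary keys, and error handling are made up here and are not part of any official tool.

```python
# Leading digit of the last segment, as described in this thread.
KINDS = {
    "1": "kelurahan (urban village)",
    "2": "desa (village)",
    "3": "desa adat (traditional village)",
    "4": "island",
}


def parse_p2588(code):
    """Split a four-segment P2588 value such as '32.13.12.2017'."""
    parts = code.split(".")
    if len(parts) != 4 or len(parts[3]) != 4:
        raise ValueError(f"expected a code like 32.13.12.2017, got {code!r}")
    province, regency, district, last = parts
    return {
        "province": province,                            # 32 -> West Java
        "regency": f"{province}.{regency}",              # 32.13 -> Subang
        "district": f"{province}.{regency}.{district}",  # 32.13.12 -> Jalancagak
        "kind": KINDS.get(last[0], "unknown"),
        "serial": last[1:],
    }


print(parse_p2588("32.13.12.2017"))
```

Codes with a different shape are rejected rather than guessed at, since this sketch only covers the village-level format shown above.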
> Same as GeoNames 1642893 Jalancagak
It seems to be a duplicate of 6581483, which is more suitable for Jalancagak (Q14632402). But looking at the coordinate position, it is even closer to the district head office, so it might also be another duplicate of 6581497 and 6583304; I don't know. For the village office coordinates that I just added (in Wikidata and OpenStreetMap) while checking earlier, I also could not use the coordinates from either GeoNames entry. -- Danil Satria (talk) 06:58, 7 September 2022 (UTC)
Thanks for the information and for bringing it to our attention.
I have been working on issues left by the bot for a few years, for a few countries, both in Wikipedia and Wikidata. This is the first time I have seen a lot of duplicated records in GeoNames on the same level (ADM3).
I agree everything should be based on the administrative sub-divisions of Indonesia and I assume the wikidata object to keep has administrative code of Indonesia (P2588).
The location of a district can be based on administration building, center of the polygon for the area, anywhere within the polygon. Eventually the coord in wikidata will be correct and wikipedia should use that coord.
I will fix wikipedia (only one article in both sv and ceb) and ask for deletion of the other wikidata object.
Maundwiki (talk) 12:05, 7 September 2022 (UTC)
> This is the first time I have seen a lot of duplicated records in geonames on the same level (ADM3).
We haven't gotten to the grimmer part yet, wait until we start processing ADM4, there are thousands of possible duplicates already detected (27945 to be exact o_O)
> I agree everything should be based on the administrative sub-divisions of Indonesia and I assume the wikidata object to keep has administrative code of Indonesia (P2588).
My future updates will also be based on P2588, whose source document will soon be updated by the ministry following the approval of 3 new provinces (split from Papua Province) on 30 June 2022.
> The location of a district can be based on administration building, center of the polygon for the area, anywhere within the polygon.
I agree that the centroid coordinate of the administrative boundary polygon is a valid value; GeoNames is not wrong in this regard. I use administrative offices to make it less confusing, since centroid locations can sometimes get weird.
> Eventually the coord in wikidata will be correct and wikipedia should use that coord.
It has to be; I'll help from OpenStreetMap. But first we need to hunt down all the duplicates here so that Wikidata can become a reliable open data hub.
> I will fix wikipedia (only one article in both sv and ceb) and ask for deletion of the other wikidata object.
I'm curious why sv and ceb always appear together? -- Danil Satria (talk) 07:37, 8 September 2022 (UTC)
The same bot, Lsjbot, using GeoNames as its database, wrote the articles in both sv and ceb. Wikipedia sv and ceb are a "copy" of GeoNames from around 2015. What Lsjbot created is no better than GeoNames at that time. The data quality in GeoNames varies by country and by the quality of the source used for GeoNames. I think for Indonesia data was entered from two sources that had the same ADM3 (and ADM4) with different coordinates. This is what gave us duplicates. I do not think they should be merged; however, one should be removed. As a non-admin I can achieve this with redirects on Wikipedia. We should not merge the Wikidata objects without emptying one object first, to avoid redundant information in the combined object. Merging keeps history for everyone; the alternative, requesting deletion in Wikidata, may be a cleaner solution since it was based on an incorrect source. I am not sure what to do in Wikidata. I am only talking here about cases where GeoNames had two records of ADM3 or ADM4 for one place/area.
For ADM4, you mean something like Desa Tambakan: six records in GeoNames and none next to each other. My guess is that the pairs in East Java and Central Java are duplicates and that there should only be four listed.
Maundwiki (talk) 12:22, 9 September 2022 (UTC)
After a quick look, there are actually 6 of them:
1. Tambakan (Q14750328) Buleleng Regency, Bali
2. Tambakan (Q12518933) Pasuruan Regency, East Java
3. Tambakan (Q14750329) Blitar Regency, East Java
4. Tambakan (Q12518936) Klaten Regency, Central Java
5. Tambakan (Q11047733) Grobogan Regency, Central Java
6. Tambakan (Q11136446) Subang Regency, West Java
So maybe nothing needs to be done on the sv & ceb wiki side, just merging the item in Wikidata with the item from other wikis that use different sources.
The thousands of possible duplicates that I mentioned earlier are usually caused by different sources being used in different Wikipedias... ceb & sv use GeoNames, nl and some others use P1588 (previously labelled as “village code of Indonesia”), some P2588 values have also been added to Wikidata, and so we end up with 3 different QIDs for one place.
That's why I can say it is still doable even with so many items to track down; most of them can be done in batches if we can figure out how to connect the sources (something like https://sig.bps.go.id/bridging-kode/ that connects P1588 with P2588). -- Danil Satria (talk) 15:03, 9 September 2022 (UTC)
I have started fixing the ones you suggested should be merged, using redirects in Wikipedia and removing data in Wikidata. For Paga (Q13095607), when we merge in Wikidata I think one of the coordinate location (P625) values should get preferred rank, or be removed from the GeoNames-created object prior to the merge.
When I see sv/ceb Wikidata objects that have a potential matching object, I will mark those with said to be the same as (P460). Maundwiki (talk) 21:31, 21 September 2022 (UTC)
Okay, I will monitor https://w.wiki/5jSu then. As for duplicate coordinates, pre-2019 GeoNames coordinates are usually not reliable... my plan is to do this step through OpenStreetMap after all duplicate Wikidata items are cleared. -- Danil Satria (talk) 01:07, 23 September 2022 (UTC)

Ambarawa

This is a different issue. Ambarawa (Q2747287) is a district of Indonesia (Q3700011) but with GeoNames type PPL, and Ambarawa (Q25135266) is a district of Indonesia (Q3700011) with GeoNames type ADM3. The ceb and sv sitelinks in Ambarawa (Q25135266) should be moved to Ambarawa (Q2747287). If there is a town separate from the district, the sv ort and ceb lungsod (villages/towns) should have its own Wikidata item or be merged with an existing one; I think in this case Kenteng (Q12490888), even if it is missing administrative code of Indonesia (P2588). This mix-up of lowest ADM level and PPL is what I correct most of the time, and it was caused by inserting the GeoNames ID of type PPL in 2014 when it should have been the GeoNames ID of type ADM3, 6410717. The bot that did this in 2014 used a lot of incorrect information. Maundwiki (talk) 12:34, 7 September 2022 (UTC)

> This is a different issue. Ambarawa (Q2747287) is a district of Indonesia (Q3700011) but with GeoNames type PPL, and Ambarawa (Q25135266) is a district of Indonesia (Q3700011) with GeoNames type ADM3.
Ambarawa is quite a well-known place in Indonesia, as the Battle of Ambarawa (Q13095653) is mentioned in school history textbooks. If we want to create a separate item, the other one would probably be a battlefield (Q4895508).
> The ceb and sv sitelinks in Ambarawa (Q25135266) should be moved to Ambarawa (Q2747287). If there is a town separate from the district, the sv ort and ceb lungsod (villages/towns) should have its own Wikidata item or be merged with an existing one,
✓ Done. I have swapped the position of the sitelinks and P1566; feel free to remove instance of (P31) Wikimedia duplicated page (Q17362920) if you want to make Q25135266 a separate item.
> I think in this case Kenteng (Q12490888) even if it is missing administrative code of Indonesia (P2588).
No, please don't. Most Indonesian ADM4 items have ADM4, ADM3[, ADM2] as aliases and sometimes as labels. I don't know where the consensus is documented but when I came here (to Wikidata) it was already like that, it is actually quite useful when doing searches. By looking at Q12490888 label (Kenteng, Ambarawa) we can tell that Kenteng is a place within Ambarawa, not Ambarawa itself.
From Indonesian Minister of Home Affairs Decree Number 050-145 of 2022 (Q112136164) source document page 1608 there is indeed Kenteng without a code at 3rd row from bottom with a description: "Menjadi wil. Kec. Bandungan, Perda No. 1/2006" (English: Became part of Bandungan district, Perda No. 1/2006)
The Bandungan district itself can be found on page 1612 of the same document, where Kenteng is given new code 33.22.20.2004 with a description "Semula wil. Kec. Ambarawa" (English: Originally part of Ambarawa district) and it turns out it's already on Wikidata in Q12490891. As you can see, here Kenteng also has the alias "Kenteng, Bandungan, Semarang" (ADM4, ADM3, ADM2) as I said before.
Q12490891 with Q12490888 is what we need to merge (already done), leaving around 27944 more with possible similar cases... tricky but doable.
> This mix-up of lowest ADM level and PPL is what I correct most of the time, and it was caused by inserting the GeoNames ID of type PPL in 2014 when it should have been the GeoNames ID of type ADM3, 6410717. The bot that did this in 2014 used a lot of incorrect information.
This information is a bit scary. The Indonesian data in GeoNames as of 2022-06-02 that I tried to process (but failed because it was so chaotic) lists 243048 PPL entries, while the official Ministry of Home Affairs document for 2022 lists only 91281 entries for ADM1-ADM4. Even if we add the 3 new provinces as well as places that are not officially recognised (level 3.5 and 5, yes there are some), that number feels too high. I really don't understand where they got the data from; please don't tell me that the bot you mentioned already added everything.
-- Danil Satria (talk) 07:48, 8 September 2022 (UTC)
I agree with switching ceb and sv between Ambarawa (Q25135266) and Ambarawa (Q2747287) and changing the GeoNames ID. The cause of the incorrect interwiki is the change in Ambarawa (Q2747287) from "30 oktober 2014 kl. 08.49". The sources used by that bot have caused a lot of incorrect interwiki links, since another bot assigning interwiki links to Lsjbot-created articles used GeoNames as the key. However, in addition to changing the GeoNames ID and removing Wikimedia duplicated page (Q17362920), the label and description should change for both, and Ambarawa (Q25135266) should have a different instance of (P31). I usually just give it geographic location (Q2221906); however, you may have something better for small places (villages?) in Indonesia.
Whether the articles based on GeoNames PPL should have interwiki links with another Wikidata object or be assigned an administrative code of Indonesia (P2588) is a later issue. It may even turn out that the data for the PPL in GeoNames cannot be verified.
Maundwiki (talk) 13:34, 9 September 2022 (UTC)
I think geographic location (Q2221906) or human settlement (Q486972) is good enough for articles based on GeoNames PPL for now; the item stays discoverable, and the external identifier there can prevent (or make it easier to detect) another duplicate from that source. Not every settlement in Indonesia has P2588; it just indicates formal recognition with financial support from the government.
Maybe in a future update of the source document for P2588 we will see Ambarawa (Q25135266) get a 33.22.10.3xxx code (the 3 in the last part is for “desa adat”/cultural settlement; currently it is just reserved for future use).
-- Danil Satria (talk) 16:01, 9 September 2022 (UTC)
P.S. Shouldn't located in the administrative territorial entity (P131) in Ambarawa (Q25135266) be an ADM4 object, not a province of Indonesia (Q5098)? However, this is part of the work to see if the PPL is valid or not. Maundwiki (talk) 13:40, 9 September 2022 (UTC)
Agreed on not using ADM1; Ambarawa (Q2747287) (ADM3) should be okay as P131. Looking at the GeoNames web map, there are so many similar entries (one with ADM and another with PPL), like 6257714 Kelurahan Kranggan = 1964761 Kranggan, 6257736 Kelurahan Lodoyong = 6257719 Lodoyong. That makes me believe both Ambarawa entries are just another mess. -- Danil Satria (talk) 01:55, 10 September 2022 (UTC)

Kecamatan Burneh/Kecamatan Bangkalan

This has changed in GeoNames. GeoNames Kecamatan Bangkalan was changed on 2021-08-07 19:49:43.623 by nga from Kecamatan Burneh. That means "Kecamatan Burneh (distrikt i Indonesien, lat -7,03, long 112,74)" should be renamed to Kecamatan Bangkalan and moved to interwiki with Bangkalan (Q7107313), which will have GeoNames ID 6762269. Kecamatan Burneh (distrikt i Indonesien, lat -7,03, long 112,80) should have interwiki with Burneh (Q7107419). I will have to go back and check if there are others like this. Maundwiki (talk) 17:51, 22 September 2022 (UTC)

That seems like the right approach; I'm just not familiar enough with GeoNames internals, sorry. -- Danil Satria (talk) 02:58, 23 September 2022 (UTC)

Kecamatan Maumere/Kecamatan Pembantu Maumere

Kecamatan Maumere was renamed in GeoNames to Kecamatan Nelle, and I think Kecamatan Pembantu Maumere should have been renamed to Kecamatan Palue instead of a new GeoNames record being created. What does the word Pembantu mean? Google gives "assistant". In any case, the two Wikipedia articles could be renamed to Nelle and Palue. Maundwiki (talk) 19:46, 22 September 2022 (UTC)

A Kecamatan Pembantu is usually formed to assist the administration of a parent district (Kecamatan Induk) whose area is too vast (so pembantu = assistant is kind of right), and if later deemed feasible, it is legalised as a definitive district (Kecamatan Definitif) equivalent to its parent district.
From Indonesian Wikipedia article about Maumere:
Maumere is the capital of Sikka Regency, East Nusa Tenggara. At this time, the area named "Maumere" no longer exists in the administrative division of Indonesia, but the areas generally considered as "Maumere city" are those included in Kecamatan Alok, Kecamatan Alok Timur, and Kecamatan Alok Barat.
Maumere was first formed as Kecamatan Maumere in 1962. In 1992, Kecamatan Alok separated from Maumere. In 2007, Kecamatan Alok Timur and Alok Barat separated from Kecamatan Alok with the addition of several villages from Kecamatan Maumere. Meanwhile, the remaining Kecamatan Maumere was also split into Kecamatan Koting and Kecamatan Nelle.
Kecamatan Nelle was the last area that officially used the name Kecamatan Maumere before it was changed in 2007; that's why Pembantu Maumere (Q26757929) gets Wikimedia duplicated page (Q17362920) of (P642) Nelle (Q12500333). This assumes GeoNames 8541636 refers to Maumere before 1962, although I did not find the parent district (Kecamatan Induk) of Kecamatan Pembantu Maumere (Q26757929).
Kecamatan Palue (GeoNames 12433279) is more suitable for Palue (Q12502325). -- Danil Satria (talk) 06:39, 23 September 2022 (UTC)

Thanks

Thanks for your work on these administrative codes. Currently Statistics Indonesia area code (P1588) is under the "Statements" section. When you are done with it, you can request that it be moved to the "Identifiers" section at Wikidata:Identifier migration/1 or Wikidata:Project chat. I requested this before, but it was rejected because the property has too many constraint violations. Hddty (talk) 04:30, 11 September 2022 (UTC)

> Thanks for your work on these administrative codes.
I use Wikidata to store details that we cannot add on OpenStreetMap as tags, hopefully (in the long term) pushing both projects to collaborate more closely.
> Currently Statistics Indonesia area code (P1588) is under the "Statements" section. When you are done with it, you can request that it be moved to the "Identifiers" section at Wikidata:Identifier migration/1 or Wikidata:Project chat.
I agree that Indonesian place names in Wikidata need a massive clean-up, not only P1588 (statistical) but also P2588 (administrative) and (if approved) island code of Indonesia, since it is published in the same document as P2588 and P4227 is not reliable enough.
We also need to take care of any subclass of (P279) former administrative territorial entity (Q19953632) based on https://id.wikipedia.org/wiki/Templat:Macam_pembagian_negara that are mixed in among the 27929 places in Indonesia that may need to be merged. Maybe you can recommend an active community in Wikidata for more in-depth discussion; I've seen Wikidata:WikiProject Indonesia and their Telegram group.
> I requested this before, but it was rejected because the property has too many constraint violations.
I can't find it. Thousands of P1588 constraint violations could easily go away if we remove item-requires-statement constraint (Q21503247): coordinate location (P625) from P1588, since BPS (Statistics Indonesia) is not a mapping agency; we have BIG (Badan Informasi Geospasial) for that. Danil Satria (talk) 08:23, 11 September 2022 (UTC)

QuickStatements for Indonesian local languages in administrative divisions

Hello Mas @Danil Satria, nice to meet you. If you don't mind, I'd like to ask for your help using QuickStatements (since I can't use it yet) to add descriptions to village/kelurahan/district items in regional languages, mainly Balinese and Javanese. Would that be possible? Thank you for your time. Angayubagia (talk) 02:11, 11 December 2022 (UTC)

I use OpenRefine; it's somewhat complicated but more reliable for large-scale data. The easy way is to use a spreadsheet such as Microsoft Excel or LibreOffice Calc and prepare 3 columns:
(A) QID (the Wikidata ID of the village/kelurahan/district to change);
(B) Djv or Dban (D = description, jv = Javanese, ban = Balinese);
(C) The description text.
Once that 3-column table is copied and pasted into the QuickStatements form, it looks roughly like:
Q1234{TAB}Djv{TAB}"Description in Javanese"
Q1234{TAB}Dban{TAB}"Description in Balinese"
Notes:
- Columns are separated by TAB
- One statement per line
- The description text is enclosed in quotation marks
If what you want to change is the label rather than the description, replace the Djv/Dban code with Ljv/Lban (L = label).
Oh, and nice to meet you too. Sorry if the explanation is a bit disorganized; if anything is still unclear, just ask again. -- Danil Satria (talk) 02:33, 12 December 2022 (UTC)
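The same three columns can be generated with a few lines of code instead of a spreadsheet. This is only a rough sketch under the assumptions in the notes above (TAB-separated, one statement per line, text in straight double quotes); `Q1234` is a placeholder QID and the function name is hypothetical.

```python
def text_commands(qid, texts, kind="D"):
    """Build one QuickStatements v1 line per language code.

    kind="D" writes descriptions (Djv, Dban); kind="L" writes labels (Ljv, Lban).
    """
    lines = []
    for lang, text in texts.items():
        # The text must be wrapped in straight double quotes, not curly ones.
        lines.append(f'{qid}\t{kind}{lang}\t"{text}"')
    return lines


commands = text_commands(
    "Q1234",  # placeholder QID
    {"jv": "Description in Javanese", "ban": "Description in Balinese"},
)
print("\n".join(commands))
```

Passing `kind="L"` produces the label variant mentioned above.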

Descriptions of regencies/cities in Indonesia

Mas Danil, I see that you often edit items for administrative divisions in Indonesia. If I may ask a favor, please help watch over the descriptions of regencies/cities in Indonesia, because a sockpuppet account, KSDVictory980 (talk · contribs · logs) (id:Special:Redirect/logid/16753043; sockmaster: Setyawanary), previously mass-deleted province names from the regency and city items in Indonesia using the Suggested Edits feature (judging by the tag).

As far as I know, province names are commonly included in the items of their subdivisions (Berezne Raion (Q2992459), Lahore Division (Q3308170), Jingzhou (Q71247), Bantayan (Q315771)) and are quite useful as a "summary" of the Wikipedia article, because the description is displayed below the article title. If you are willing to help, I can email you the source code I have compiled for a QuickStatements batch. Thank you very much. Labdajiwa (talk) 01:50, 11 May 2023 (UTC)

Yes, I'm currently focusing on data for islands and administrative divisions of Indonesia. For administrative divisions I mostly deal with Dati III and Dati IV (third- and fourth-level divisions), because that is the level with the most duplicates and data changes (moved, deleted, merged, changed status, renamed, etc.), and also because I assume 'plenty of people already take care of Dati I and II' :-)
As for descriptions, I agree they should follow a pattern. For Dati IV I use "{STATUS:desa/kelurahan/gampong/lembang/kampung/pekon/etc.} di [Kabupaten/Kota] {namaDati2}, {namaDati1}", and if there is more than one Dati IV with the same name within one regency/city I also add "(kecamatan/distrik/kapanewon/etc.) {namaDati3}," before Kabupaten/Kota. For the English description I append ", Indonesia" at the end. This pattern in the description text is very helpful for recognizing which items have been, have not been, or need to be merged, or must be marked as duplicates because of a sitelink conflict.
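As a rough illustration of the Dati IV pattern just described, the template could be filled like this. It is only a sketch of the Indonesian-language pattern; the function and parameter names are hypothetical, and the example values (a desa in Kabupaten Subang, Jawa Barat) are taken from the Jalancagak discussion earlier on this page.

```python
def dati4_description(status, dati2_kind, dati2_name, dati1_name, dati3=None):
    """Fill '{status} di [{dati3},] {Kabupaten/Kota} {dati2}, {dati1}'."""
    place = f"{dati2_kind} {dati2_name}, {dati1_name}"
    if dati3 is not None:
        # Only added when several Dati IV share a name within one regency/city.
        place = f"{dati3}, {place}"
    return f"{status} di {place}"


print(dati4_description("desa", "Kabupaten", "Subang", "Jawa Barat"))
print(dati4_description("desa", "Kabupaten", "Subang", "Jawa Barat",
                        dati3="Kecamatan Jalancagak"))
```

Because every description is built from the same fields, two items that produce the same string are easy to spot as merge or duplicate candidates.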
Now, for the Dati II descriptions you mean, what pattern should be used? Cities that double as provincial capitals also need to be considered.
Personally, I think we also need to standardize labels for Dati II. Unfortunately Wikidata:WikiProject Administrative Units in Indonesia is no longer active (last updated 2016-08-05T14:32:11) and its labelling recommendations/agreements actually cause confusion (in Dati II alone there are already 60 pairs of duplicate names). I see some labels have had Kabupaten/Kota added, but not consistently yet; who knows how many Dati III and IV beneath them have been swapped or remain undetected as duplicates because their Dati II uses the same/mixed label.
Just a few days ago I found and removed hundreds of P131 statements on Dati IV items that apparently meant to state the province Yogyakarta (Q3741), but because it was labelled "Daerah Istimewa Yogyakarta", CEBwiki and SVwiki ended up stating Yogyakarta (Q7568) (city/Dati II) instead. Both are wrong, admittedly, but finding the correct value/item becomes much harder and more annoying :-(
Back to the original issue (Dati II descriptions): if only certain regions need changing, just send the list of QIDs (a reply here is enough). But if the whole of Indonesia is to be standardized, I already have the data (~115,830 records from Indonesian Minister of Home Affairs Decree Number 050-145 of 2022 (Q112136164), including 91,285 active administrative divisions); what remains is deciding how the description sentence pattern should look (both for those that double as capitals and those that don't). We need to think it through carefully, since I notice it will also be used as a template for descriptions in other languages.
I'm willing and able to help; after all, I'm currently at the stage of collecting regulations on the formation of administrative divisions, both laws (province/regency/city) and Perda/Qanun (Dati III and IV), so we might as well tidy everything up. -- Danil Satria (talk) 05:50, 11 May 2023 (UTC)
My focus is only on regencies/cities, not down to district and village items. The pattern is: kabupaten/kota di (nama provinsi), Indonesia, and for provincial capitals: ibu kota Provinsi (nama provinsi), Indonesia. The regency/city description pattern does not include the word "Provinsi"; according to RXerself, the province name alone is enough. I'm not sure why; perhaps the word is redundant? For the provincial capital pattern, in my opinion the word "Provinsi" can be added because "ibu kota" generally refers to a national capital. What do you think? Is this fine, or do you have another view? Labdajiwa (talk) 07:25, 11 May 2023 (UTC)
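For illustration, the two proposed patterns could be turned into QuickStatements description lines along these lines. This is just a sketch: the function names are made up, `Q1234` and `Q5678` are placeholder QIDs, and `Did` is the v1 code for an Indonesian-language description.

```python
def dati2_description(kind, province, is_capital=False):
    """Fill the proposed regency/city pattern, or the provincial-capital one."""
    if is_capital:
        return f"ibu kota Provinsi {province}, Indonesia"
    return f"{kind} di {province}, Indonesia"


def did_command(qid, text):
    """One QuickStatements v1 line setting the Indonesian description."""
    return f'{qid}\tDid\t"{text}"'


# Placeholder QIDs; real batches would use the regency/city items.
print(did_command("Q1234", dati2_description("kabupaten", "Jawa Barat")))
print(did_command("Q5678", dati2_description("kota", "Jawa Barat", is_capital=True)))
```

Keeping the pattern in one function means a later change of wording only has to be made in one place before re-running the batch.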
I have received the capital data and completed it from the laws establishing each province:
Actually I would like to state both the status as Kota/Kabupaten and as capital (the same as in Den) in the description, but for provinces whose capital is a "Kota" the sentence feels odd. Only those whose capital is a Kabupaten can be adjusted to "kabupaten sekaligus ibu kota Provinsi ... , Indonesia".
The province descriptions turn out to have been changed already; the only difference is that the provinces in Papua have an added "otonom" whose significance I don't understand. If it really must be changed, go ahead; it can be done manually, it's only 6 items.
As for the redundancy of the word Provinsi in Dati II/III/IV descriptions, I agree, since it causes no confusion either, because no provinces share a name. What I disagree with is dropping the word Kota/Kabupaten from Dati II labels; this can cause confusion because there are 60 Dati II whose labels then become identical.
For now I haven't touched the Dati II labels, only the descriptions (as requested). But later, depending on developments and needs (and joint discussion, if any), I may change them too, perhaps together with adding data from the laws establishing provinces/regencies/cities that I'm collecting. -- Danil Satria (talk) 11:06, 11 May 2023 (UTC)
Hello, Mas Danil. It seems he is still making unconstructive edits. The editing pattern of Special:Contribs/103.28.114.254 resembles Setyawan Ary's edits. Could you run the batch for the English descriptions again? I have already run the Indonesian descriptions. Thanks. Labdajiwa (talk) 12:40, 26 July 2023 (UTC)
As it happens I'm currently preparing data from Keputusan Menteri Dalam Negeri Nomor 100.1.1-6117 Tahun 2022 (Q118697978); I'll update everything at once then... Dati 1 and 2 at least should be ready within this week. -- Danil Satria (talk) 15:18, 26 July 2023 (UTC)

Kepulauan Tengah

Excuse me, could you please merge the items Greater Islands (Q4117367) and Greater Islands (Q24829352)? Thank you very much. Anhar Karim (talk) 05:49, 17 May 2023 (UTC)

✓ Done. I don't have data on island groups yet, only on islands (from Kemendagri, the Ministry of Home Affairs). Thanks for the information, and nice to meet you :-) -- Danil Satria (talk) 11:27, 17 May 2023 (UTC)
Thank you for your help. Frankly, I'm still new to Wikidata and need to study it further. Actually there are still many islands I've come across before that have different items but are the same entity. Yes, government-owned data on island groups is still very scarce, for example at KKP, BIG, Kemendagri, and others. Anhar Karim (talk) 11:47, 17 May 2023 (UTC)
For islands there is indeed still a lot of work; sometimes the coordinates of Indonesian Small Islands Directory ID (P4227), island code of Indonesia (P11163) and GeoNames ID (P1566) get mixed up. That's why I have to use a GIS application to tidy them up.
For island groups, the difficulty is more that there is no official data yet. I rely a lot on old maps at https://www.oldmapsonline.org/compare, although the names used sometimes differ from map to map; we just have to be careful in choosing which map to follow. -- Danil Satria (talk) 12:34, 17 May 2023 (UTC)

Duplicate items

1. Pulau Pelokan Lompo: Pelokang Islands (Q24847198) and Pelokan Lompo Island (Q115403895); 2. Pulau Pelokan Caddi: Pelokan Caddi Island (Q115403894) and Togo-Togo Pelokan Island (Q115403923). Anhar Karim (talk) 15:53, 20 May 2023 (UTC)

Sorry for the slow reply; I only just got access to a computer (since I also had to compare the coordinates with other data). It turns out that around the location of those islands, old maps such as the Makassar map by the U.S. Army Map Service, 1944 and Postiljon Eilanden from the Survey of India, 1945, in line with GeoNames record 1631654, indeed list/treat all the islands at 7°11′20″S, 118°23′47″E as a single entity, Kepulauan Pelokang (Q24847198).
Meanwhile, newer data such as Indonesian Minister of Home Affairs Decree Number 050-145 of 2022 (Q112136164), Direktori Pulau-Pulau Kecil Indonesia (Q30276731), and the Sistem Informasi Nama Rupa Bumi of Badan Informasi Geospasial split it into:
  1. Pulau Pelokan Lompo (Q115403895) 73.10.40078, BIG 154946, DPKI 8113
  2. Pulau Pelokan Caddi (Q115403894) 73.10.40077, BIG 154945, DPKI 8112
  3. Pulau Togo-Togo Pelokan (Q115403923) 73.10.40132, BIG 155299
  4. Pulau Karangan Barat Pelokan (Q115403867) 73.10.40033, BIG 155298
  5. Pulau Karangan Utara Pelokan (Q115403880) 73.10.40046, BIG 155300
In my view, Kepulauan Pelokang (Q24847198) is best described as an island group (Q1402592) (or an atoll (Q42523), though apparently not an archipelago (Q33837)), serving as the collective name for the five islands above and linked to them with has part(s) (P527) and part of (P361).
Later, in OpenStreetMap (the map used by Wikimedia), the five islands can also be mapped with place=island/islet and combined, each with role=outer, in a single type=multipolygon relation for Kepulauan Pelokang (Q24847198).
That way all six records are accommodated, since each is valid (each has its own source/reference) and no reference supports a claim that they should be merged. We cannot merge or remove any of them; the removed one would only be re-created later (by another editor) and become a duplicate. -- Danil Satria (talk) 04:40, 22 May 2023 (UTC)
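As a rough illustration of the has part(s) (P527) / part of (P361) linking proposed above, the statements could be prepared as QuickStatements v1 commands (tab-separated subject, property, value). This is only a sketch: the function name `part_of_commands` is hypothetical, the QIDs are the ones listed in this thread, and the output would still need to be checked against the live items before being pasted into QuickStatements.

```python
# QIDs as listed in this thread.
GROUP_QID = "Q24847198"  # Kepulauan Pelokang
ISLAND_QIDS = [
    "Q115403895",  # Pulau Pelokan Lompo
    "Q115403894",  # Pulau Pelokan Caddi
    "Q115403923",  # Pulau Togo-Togo Pelokan
    "Q115403867",  # Pulau Karangan Barat Pelokan
    "Q115403880",  # Pulau Karangan Utara Pelokan
]


def part_of_commands(group_qid, member_qids):
    """Return QuickStatements v1 lines linking a group and its members.

    Hypothetical helper: each member yields one 'has part(s)' line on the
    group and one 'part of' line on the member, with tab-separated columns.
    """
    lines = []
    for member in member_qids:
        lines.append("\t".join([group_qid, "P527", member]))  # has part(s)
        lines.append("\t".join([member, "P361", group_qid]))  # part of
    return lines


commands = part_of_commands(GROUP_QID, ISLAND_QIDS)
print("\n".join(commands))
```

Each island produces two lines, so the five islands give ten commands in total.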
OK. Thank you, mas Danil Satria, for the explanation and information. My assumption turned out to be wrong: I thought Pulau Pelokan was a single island that covered those five islands. I also understood "caddi" and "togo-togo" to mean the same thing in the local language, namely "small", which led me to conclude they were the same entity. Anhar Karim (talk) 05:10, 22 May 2023 (UTC)
Since we happen to have someone with local knowledge: around there, several islands appear from their coordinates to be the same island (judging by satellite imagery) but have different names in the Kemendagri and GeoNames data:
  1. Pulau Sapiriah (Q115403911) and Pulau Sapiriah (Q24827988)
  2. Pulau Sanipa (Q115403910) and Pulau Sanipa (Q25015781)
  3. Pulau Balobaloang Lompo (Q115403849) and Pulau Sumanga (Q24830419)
  4. Pulau Balobaloang Caddi (Q115403848) and Pulau Balobaloang Caddi (Q25015852)
  5. Pulau Sumanga (Q115403922) and Pulau Balobaloang Lompo (Q25012170)
If possible, please confirm which ones really are the same. -- Danil Satria (talk) 05:39, 22 May 2023 (UTC)
After rechecking on the PPK KKP site, Google Maps, and old map sources, my view is:
  1. Pulau Sapiriah (Q115403911) and Pulau Sapiriah (Q24827988) = the same entity, differing only in naming; in the local islanders' language, place names are typically formed with the prefix sa-
  2. Pulau Sanipa (Q115403910) and Pulau Sanipa (Q25015781) = the same entity, differing only in naming; in the local islanders' language, place names are typically formed with the prefix sa-
  3. Pulau Balobaloang Lompo (Q115403849) and Pulau Sumanga (Q24830419) = according to this coordinate point, it should be named Pulau Sumanga (from Makassarese, "spirit/struggle")
  4. Pulau Balobaloang Caddi (Q115403848) and Pulau Balobaloang Caddi (Q25015852) = already correct; the name is not an issue because one follows the regional language (Makassarese) and the other follows Indonesian
  5. Pulau Sumanga (Q115403922) and Pulau Balobaloang Lompo (Q25012170) = according to this coordinate point, it should be named Pulau Balobaloang Besar or Balobaloang Lompo (from Makassarese, "large patterned"), so naming it Pulau Sumanga is a mistake. Anhar Karim (talk) 23:30, 22 May 2023 (UTC)
    Sapiriah, Sanipa, and Balobaloang Caddi have been merged as confirmed.
    The Pulau Sumanga case is rather interesting: the data said to be wrong is precisely the Kemendagri data (claimed to derive from the Gazeter Nasional Tahun 2021), Sinar BIG with the data status "sudah penetapan" (already designated; also checked against the Gazeter Republik Indonesia Edisi 1 Tahun 2022, p. 991), and KKP's DPKI, which is the reference used in the Indonesian Wikipedia article. Yet almost all old maps indicate that this location is wrong:
    1. Postiljon Eilanden, 1945
    2. Makassar / U.S. Army Map Service, 1944
    3. Makassar / Topografische Inrichting, 1942
    4. Makassar / Topografische Inrichting, 1926
    I now regard the coordinates of Pulau Balobaloang Lompo (Q115403849) and Pulau Sumanga (Q115403922) as swapped, in line with the confirmation above. This is also not the first such case I have encountered while entering island code of Indonesia (P11163) data.
    In other words, Pulau Balobaloang Lompo (Q115403849) will be merged with Pulau Balobaloang Besar (Q25012170), while Pulau Sumanga (Q115403922) will be merged with Pulau Sumanga (Q24830419).
    The reason is that, after checking once more, the Kemendagri data that comes from BIG is recorded as originating from the Survei KKP 2006-2012, so it seems everything comes from a single source. -- Danil Satria (talk) 11:30, 24 May 2023 (UTC)
Thank you for fixing it.
Exactly. The Survei KKP 2006-2012 data is their foundation, although there are additions, based on new information, for data that was missed at entry. The coordinates, for example, are almost all exactly the same figures. Even so, I sometimes still find erroneous data when cross-checking against other sources.
Yes, simple logic suggests it too: the three islands run from north to south as Pulau Balobaloang Besar ⇒ Pulau Balobaloang Kecil ⇒ Pulau Sumanga. Anhar Karim (talk) 16:03, 24 May 2023 (UTC)

Value for the "adalah" (instance of) property on items about schools in Indonesia

Hello @Danil Satria, thank you for updating the item I edited, SD Negeri Kebon Sirih 4. There are a few interesting things I would like to ask. On Wikidata I found an item for Sekolah Dasar Negeri (state primary school).

1. Why is that item not used as the value of the "adalah" (instance of) property on items about state primary schools?

2. Is using the separate items for state school and primary school a general rule or convention for items on education topics in Indonesia?

3. Does this arrangement affect queries?

4. If I extrapolate, should the value of property P31 on items about state senior high schools likewise be split into "senior high school" and "state school"?

Thank you, and my apologies if anything here is unwelcome. HA (talk) 08:38, 14 December 2023 (UTC)

In my view it would be inefficient to use SD N (Q109661193), SMP N (Q109661206), SMP S (Q109661207) and the like, because there would end up being too many Indonesian school types. See User:Danil_Satria/Sekolah for a picture of the current situation.
1. Because there is neither a source nor a legal basis for it; ministerial regulations/decrees only divide schools by type and level, for example:
(1) Early childhood education takes the form of Raudhatul Athfal (RA) (Q7415634).
(2) Primary education takes the form of MI (Q12496022) and MTs (Q12496021).
(3) Secondary education takes the form of MA (Q12496019) and MAK (Q12496020).
  • Peraturan Menteri Agama Nomor 7 Tahun 2012 (Q123509569) on Pendidikan Keagamaan Kristen (Christian religious education), Article 4: formal Christian religious education consists of:
a. Sekolah Dasar Teologi Kristen (SDTK);
b. Sekolah Menengah Pertama Teologi Kristen (SMPTK); and
c. Sekolah Menengah Teologi Kristen (SMTK) Q110102175 / Sekolah Menengah Agama Kristen (SMAK) Q123509842.
  • Peraturan Menteri Agama Nomor 1 Tahun 2013 (Q123510058) on Sekolah Menengah Agama Katolik (Q123510092), which is also abbreviated SMAK; on Wikidata I follow the referensi.data.kemdikbud.go.id version instead, SMAg.K.
Not to mention the divisions of Buddhist education (Dhammasekha) into Nava, Mula, Muda, Uttama, and Uttama Kejuruan, and of Hindu education: Pratama/Adi/Madyama/Utama/Maha Widya Pasraman.
2. I am not an education professional myself, so I cannot say whether it is a rule or a convention. If we check the NPSN record for SD Negeri Kebon Sirih 4, Kemendikbudristek lists:
Status Sekolah: NEGERI
Bentuk Pendidikan: SD
If the official source separates the two, I simply follow it :)
As far as I know, Wikidata has no separate properties for Status and Bentuk (form). A few times I have seen has use (P366) used for the status (state/private), but it does not feel quite right. If a dedicated property for either one is created later, we can simply move the value over from instance of (P31).
3. Certainly. The more subclasses there are, the longer and more complicated queries become. There are also more education-type subclass pages to watch, for example when regulations change, when a school item is in the wrong category or changes category, or simply for vandalism.
4. Yes, they should preferably be split, or at least handled consistently. If we do want to use SDN/SDS and SMPN/SMPS, then we also have to create and maintain pages/items for SMAN/SMAS, SMKN/SMKS, MAN/MAS, MAKN/MAKS, MTsN/MTsS, MIN/MIS, SMAg.KN/SMAg.KS, SMAKN/SMAKS, SMTKN/SMTKS, SLBN/SLBS, SDLBN/SDLBS, SMPLBN/SMPLBS, SMLBN/SMLB and others.
You're welcome, and no objection at all, since this will also be useful if similar questions come up later. -- Danil Satria (talk) 10:56, 14 December 2023 (UTC)
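To make point 3 concrete, here is a minimal sketch of how a query might select schools when the form and the status are stored as two separate instance of (P31) values. It assumes the standard `wd:`/`wdt:` prefixes and label service of the Wikidata Query Service; the helper name `build_school_query` is hypothetical, and the QIDs passed in below are placeholders, not real items.

```python
def build_school_query(form_qid, status_qid, limit=100):
    """Build a SPARQL query for items that have BOTH P31 values.

    Hypothetical helper: form_qid stands for the form of education
    (e.g. primary school) and status_qid for the status (e.g. state school).
    """
    return f"""SELECT ?school ?schoolLabel WHERE {{
  ?school wdt:P31 wd:{form_qid} ;
          wdt:P31 wd:{status_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "id,en". }}
}}
LIMIT {limit}"""


# Placeholder identifiers only; replace with the real QIDs before running.
FORM_QID = "Q_FORM_PLACEHOLDER"
STATUS_QID = "Q_STATUS_PLACEHOLDER"

print(build_school_query(FORM_QID, STATUS_QID, limit=10))
```

With this arrangement, changing only one of the two arguments covers another combination (for example a different form with the same status), without needing a dedicated item for every form/status pair.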

Thank you!

Mas Danil, thank you for your contributions to the data on regions and islands in Indonesia. Nice to meet you, too. Biyanto R (talk) 22:24, 16 December 2023 (UTC)

For administrative regions and islands, the data based on Kemendagri is already complete on Wikidata. However, I have trouble merging the Indonesian geographic items imported from svwiki because of the language barrier, so I would appreciate your help :)
-- Danil Satria (talk) 09:33, 17 December 2023 (UTC)

Places in Indonesia

You imported a number of places from Geonames. To prevent importing duplicates, you may want to clean these up first (by adding sitelinks or creating items). GZWDer (talk) 12:14, 1 August 2024 (UTC)

Thanks for the reminder. So far I have used GeoNames and Wikidata coordinates to prevent duplicates via JOSM and OpenRefine. The next step is to use a combination of Label and Description. I haven't used cebwiki yet because that's where most of the duplicates I've encountered so far come from. Danil Satria (talk) 02:32, 2 August 2024 (UTC)
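For readers who want to try a similar check outside JOSM or OpenRefine, the following is a minimal sketch of coordinate-plus-label matching. It is not the workflow described above: the function names, the 500 m threshold, and the sample records are all assumptions made up for illustration, and the distance uses the standard haversine formula.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    radius_m = 6_371_000.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def duplicate_candidates(new_places, existing_items, max_distance_m=500.0):
    """Pair new places with existing items that are nearby and share a label.

    Hypothetical helper; the threshold is an arbitrary assumption.
    """
    candidates = []
    for place in new_places:
        for item in existing_items:
            distance = haversine_m(place["lat"], place["lon"],
                                   item["lat"], item["lon"])
            same_label = place["label"].casefold() == item["label"].casefold()
            if distance <= max_distance_m and same_label:
                candidates.append((place["id"], item["id"]))
    return candidates


# Made-up records for illustration only.
new_places = [
    {"id": "new-1", "label": "Pulau Contoh", "lat": -7.000, "lon": 118.000},
    {"id": "new-2", "label": "Pulau Lain", "lat": -8.000, "lon": 118.000},
]
existing_items = [
    {"id": "item-A", "label": "pulau contoh", "lat": -7.001, "lon": 118.000},
]

print(duplicate_candidates(new_places, existing_items))
```

Requiring an identical label is a deliberately strict filter. Dropping that condition would instead surface nearby items with different names, such as the Sumanga/Balobaloang Lompo case discussed earlier on this page, for manual review.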