Crawling status: Tue, 27 Oct 2020
Data not updated
HTML has changed
Hasn't been resolved
After we finalised the first phase of our baseline work, we realised that we needed to look more deeply into what content each site includes, how frequently each site is updated, and which recurring issues we face in accessing and ingesting the data. The goal of this second phase is to enhance our understanding of what baseline data on COVID-19 is available on provincial websites and how accessible it is.
The dashboard shows the key figures on how many provincial websites are available and can be successfully accessed. From those numbers, it is possible to assess how many websites update their data (mostly new cases and casualties), which websites are not updated, and which ones have changed their site or HTML structure. It also shows how much time we need to address those changes in a practical way and resume retrieving information from a site. In addition, we can look into which sites' retrieval processes have failed due to other factors.
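The grouping described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the code behind the dashboard: the `CrawlResult` fields, the `Status` names, and the order of the checks are all hypothetical. Only the labels "Data not updated" and "HTML has changed" come from the dashboard itself.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class Status(Enum):
    # Two labels mirror the dashboard; the rest are assumed for illustration.
    UPDATED = "Data updated"
    NOT_UPDATED = "Data not updated"
    HTML_CHANGED = "HTML has changed"
    FAILED_OTHER = "Failed due to other factors"
    UNREACHABLE = "Site not accessible"


@dataclass
class CrawlResult:
    """Outcome of one daily retrieval attempt (hypothetical structure)."""
    province: str
    reachable: bool          # did the site respond at all?
    parsed: bool             # were the expected HTML elements found?
    error: Optional[str]     # any other failure, e.g. a timeout message
    data_changed: bool       # do the figures differ from the previous run?


def classify(result: CrawlResult) -> Status:
    """Map one crawl result to a single dashboard category."""
    if not result.reachable:
        return Status.UNREACHABLE
    if result.error is not None:
        return Status.FAILED_OTHER
    if not result.parsed:
        return Status.HTML_CHANGED
    return Status.UPDATED if result.data_changed else Status.NOT_UPDATED


def summarise(results: Iterable[CrawlResult]) -> Counter:
    """Count how many sites fall into each category."""
    return Counter(classify(r) for r in results)
```

Calling `summarise` on the results of one daily run gives the per-category counts that a dashboard of this kind would display. A site is assigned to exactly one category, so the counts add up to the number of sites crawled.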
COVID-19 has affected all 34 provinces in Indonesia. A system is in place to run the data retrieval process daily, with each day's run concluding at 7 PM (Jakarta time).
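As a small illustration of the daily timing, the sketch below works out the next 7 PM Jakarta-time cutoff from any timestamp. It is not our scheduler: the names `WIB`, `CUTOFF`, and `next_cutoff` are hypothetical, and the code relies only on the fact that Jakarta time (WIB) is UTC+7 with no daylight saving.

```python
from datetime import datetime, time, timedelta, timezone

# Jakarta time (WIB) is a fixed UTC+7 offset with no daylight saving.
WIB = timezone(timedelta(hours=7), "WIB")
CUTOFF = time(19, 0)  # 7 PM, when the daily retrieval concludes


def next_cutoff(now: datetime) -> datetime:
    """Return the next 7 PM WIB cutoff at or after a timezone-aware `now`."""
    local = now.astimezone(WIB)
    cutoff = datetime.combine(local.date(), CUTOFF, tzinfo=WIB)
    if local >= cutoff:
        # Today's run has already concluded, so move to tomorrow's cutoff.
        cutoff += timedelta(days=1)
    return cutoff
```

For example, a timestamp of 11:00 UTC on 27 October 2020 is 6 PM in Jakarta, so the function returns 7 PM WIB on the same day. One hour later it returns 7 PM WIB on 28 October.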
We will continue monitoring the data on the provincial websites over the next few months so that we can analyse it further and determine whether the available data sets can be combined with additional ones to fill information gaps around the COVID-19 response.