The Archived Web Dataset
- Date
- 24.11.2021
- Time
- 10:45 - 12:15
- Speaker
- Helge Holzmann
- Affiliation
- Internet Archive (archive.org)
- Language
- English
- Main topic
- Computer Science
- Host
- ScaDS.AI Dresden/Leipzig
- Description
- The Internet Archive (IA) has been archiving broad portions of the global web for 25 years. This historical dataset offers unparalleled insight into how the web has evolved over time. Part of this collecting effort has been building the capacity to support large-scale computational research on this enormous dataset. Web archives give us the opportunity to process the web as if it were a dataset that can be searched, analyzed and studied, temporally as well as retrospectively. Our engineering efforts address the very specific traits of the archived web for our interdisciplinary users and partners by hiding the complexity and abstracting away technical details. This talk will outline different perspectives on computational research with archived web data, along with technical challenges, novel developments and opportunities, as well as considerations to keep in mind when working with this unique dataset.
Last modified: 22.11.2021, 19:00:56
Venue
Online; please follow the link: https://events.scads.ai/event/4/
Organizer
Center for Information Services and High Performance Computing, Zellescher Weg 12-14, 01069 Dresden
- Phone
- +49 351 463-35450
- Fax
- +49 351 463-37773
- TUD ZIH
- Homepage
- http://tu-dresden.de/zih
