Title: Audio Information Retrieval
Instructors: Prof. Dr. Rainer Lienhart, Gregor van den Boogaart
Synopsis: Audio information retrieval (AIR) deals with the problem of automatically deriving higher-level information from an audio signal by directly processing the content of the signal. Typical applications are:
  • Classifying sounds (e.g. silence, applause, speech, music)
  • Artist or genre recognition
  • Music recommendation
  • Speech and speaker recognition

An AIR system normally uses techniques from signal processing and psychoacoustics combined with techniques from machine learning. The design is strongly driven by knowledge from the specific field of application (e.g. linguistics, music).

The lecture introduces the underlying techniques of AIR. Outline:

  • Digital signal processing (DSP)
  • Advanced applications of DSP
  • Basics for machine learning on audio signals (features, techniques, systems)
  • Applications of AIR
Language: The lecture will be held in German, but most of the literature is in English.
Time:
  • Mon: 12:15-13:45 (lecture, V); Room 202, Eichleitnerstraße 30
  • Mon: 14:00-15:30 (exercise, Ü); Room 202, Eichleitnerstraße 30
  • The course starts on Monday, 13 October 2008 at 12:15 p.m. with the lecture. The start of the exercises will be announced in the lecture.
Registration: Closed. Registration was handled via LectureReg.
Exam: Written exam at the end of the course.
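Credits: 2+1 SWS (weekly semester hours), certificate (Schein): yes, 4 LP (credit points)
Multimedia subject areas (Teilbereiche): Multimedia-Methoden (multimedia methods), Systemnahe Grundlagen von Multimedia (system-level foundations of multimedia)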
Prerequisites: Intended for students in their 5th semester or higher. Knowledge of linear algebra and analysis is required; knowledge of statistics and machine learning is useful.
Related Courses:
  • The seminar "Audiosignalverarbeitung" (audio signal processing) is also held this winter term.
  • The AIR lecture is also intended, though not solely, as preparation for the upcoming course "Multimedia Praktikum (Audio)" in the summer term 2009.

News

  • The lecture has been moved from Friday to Monday. It therefore starts earlier, on Monday, 13 October 2008.

Online Material

Slides

Chapter                            4 slides per page
1 Introduction                     AIR_chap01_p4.pdf
2 Mathematical and other basics    AIR_chap02_p4.pdf
3 Digital Signals                  AIR_chap03_p4.pdf
4 Digital Systems                  AIR_chap04_p4.pdf
5 Random Signals                   AIR_chap05_p4.pdf
6 Short Time Fourier Transforms    AIR_chap06_p4.pdf
7 Applications of DSP              AIR_chap07_p4.pdf
8 Machine Learning Basics          AIR_chap08_p4.pdf
9 Features                         AIR_chap09_p4.pdf

Resources

Resource                     File, Download
English-German dictionary    AIR_dict.pdf

Homework, Exercises

Due date Task
Get ASAP: Oppenheim, A. V., Schafer, R. W., and Buck, J. R. Discrete-Time Signal Processing. 2nd edition. Prentice-Hall, Inc., 1999.
27.10.2008
  • Read chapters 2.1, 4.1 - 4.3, 4.8 from Oppenheim et al., 1999 (preparation for lecture 3).
  • Exercise, Sheet 1: PDF
03.11.2008
  • Read chapters 2.2 - 2.4, 2.6 - 2.9, 5.1 from Oppenheim et al., 1999 (preparation for lecture 4).
  • Exercise, Sheet 2: PDF
17.11.2008
  • Read chapters 2.10, A.1 - A.2, A.4 from Oppenheim et al., 1999 (preparation for lecture 5).
  • Read chapters 8.5 - 8.8, 10.1 - 10.5, 7.2 from Oppenheim et al., 1999 (preparation for lecture 6).
  • Exercise, Sheet 3: PDF
24.11.2008
  • Exercise, Sheet 4: PDF
01.12.2008
  • Read Vikas Raykar, Igor Kozintsev, Rainer Lienhart. Position Calibration of Microphones and Loudspeakers in Distributed Computing Platforms. IEEE Transactions on Speech and Audio Processing, Vol. 13, No. 1, pp. 70-83, Jan. 2005. PDF alt.: PDF
  • Exercise, Sheet 5: PDF
05.12.2008 Read:
  • Jonathan Foote. An Overview of Audio Information Retrieval. Multimedia Systems, Vol. 7, No. 1, pp. 2-10, 1999. PDF
  • Elena Ranguelova and Mark Huiskes. Pattern Recognition for Multimedia Content Analysis. In: Henk M. Blanken, Henk Ernst Blok, Ling Feng, and Arjen P. de Vries (eds.), Multimedia Retrieval. Springer Verlag, pp. 53-95, 2007.
  • Beth Logan. Mel frequency cepstral coefficients for music modeling. Proceedings of the First International Symposium on Music Information Retrieval (ISMIR), 2000. PDF