This course focuses on the concept of “big data” and studies modern techniques and storage platforms for managing it at Internet scale. Specifically, we will study: large-scale system architectures (peer-to-peer and cloud computing); databases on the Internet, covering relational, parallel, and distributed databases, with emphasis on distributed file systems (HDFS), NoSQL stores (HBase, Cassandra), graph databases (Neo4j), and NewSQL; execution models for large volumes of data (MapReduce, BSP) and the platforms that implement them (Hadoop, Hama, Spark, etc.); and applications of the above and the implementation of distributed algorithms.