Programming Bootcamp for Scientists #41

  • 00:00 Pandas - Python Data Analysis Library

  • 01:10 Datasets in use (Planets, Stack Overflow Survey data)

  • 05:48 Read CSV file into DataFrame

  • 11:05 DataFrame Statistics

  • 14:35 Show first and last rows

  • 15:37 DataFrame select columns

  • 17:37 DataFrame select rows (iloc)

  • 18:37 DataFrame select rows and columns

  • 19:05 DataFrame filter rows by size

  • 20:25 DataFrame filter rows by value

  • 21:17 Filter rows with elementwise boolean and (&)

  • 23:34 sort (sort_values)

  • 24:09 loc vs. iloc

  • 28:03 Add calculated column, remove, delete (drop) column

  • 30:11 Calculation (gravitational force)

  • 36:03 Read CSV set index column

  • 36:50 Count values

  • 38:35 Select top items (head, tail)

  • 40:24 Pandas Show histogram

  • 42:13 Pandas read selected columns

  • 44:30 Read files in chunks

  • 49:24 Combine columns using any function (apply, vectorize)

  • 51:24 Read and write Excel files (read_excel, to_excel)

  • 52:00 Create Excel file for experiment with random data

  • 55:14 Calculate Genome metrics

  • 57:25 Exercise: Pandas (incl. Kaggle)
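
Several of the steps listed above (reading a CSV into a DataFrame, statistics, selecting rows and columns, filtering, sorting, and adding or dropping a column) can be sketched in a few lines. This is a minimal sketch, not the code shown in the video: the inline CSV text, the column names (`name`, `distance`, `mass`), and all values are made up for illustration and merely stand in for the planets dataset.

```python
import io

import pandas as pd

# Hypothetical toy data standing in for the planets CSV file; the names and
# values are made up for illustration and are not real measurements.
CSV_TEXT = """name,distance,mass
Alpha,0.4,0.05
Beta,1.0,1.00
Gamma,5.2,317.80
Delta,9.5,95.20
"""

# Read CSV text into a DataFrame (passing a file path works the same way).
df = pd.read_csv(io.StringIO(CSV_TEXT))

# First rows and summary statistics of the numeric columns.
first_two = df.head(2)
stats = df.describe()

# Select a column by name, a row by position (iloc), and rows plus
# columns by label (loc). Unlike iloc, loc slices include the end label.
names = df["name"]
second_row = df.iloc[1]
subset = df.loc[1:2, ["name", "mass"]]

# Filter rows with an elementwise boolean "and"; each condition needs
# its own parentheses because & binds tighter than the comparisons.
big_and_far = df[(df["mass"] > 50) & (df["distance"] > 6)]

# Sort by a column, largest value first.
by_mass = df.sort_values("mass", ascending=False)

# Add a calculated column, then build a copy without it.
df["ratio"] = df["mass"] / df["distance"]
without_ratio = df.drop(columns=["ratio"])

print(big_and_far["name"].tolist())  # ['Delta']
print(by_mass["name"].tolist())      # ['Gamma', 'Delta', 'Beta', 'Alpha']
```

With a real file, `pd.read_csv` would take the file path in place of the `io.StringIO` object; the remaining calls stay the same.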

Author

Gabor Szabo (szabgab)

Gabor Szabo is the author of the Python Maven web site.
