Python for Data Analysis: Data Wrangling with pandas, NumPy & Jupyter / Data Wrangling: Join, Combine, and Reshape

Question 39 of 50

When is it appropriate to use the `left_on` and `right_on` arguments in `pandas.merge`?

Correct answer: When the join key column names are different in the left and right DataFrames.

Explanation

`left_on` and `right_on` let you join two DataFrames whose key columns do not share a name: `left_on` names the key column (or columns) in the left DataFrame, and `right_on` names the corresponding key column (or columns) in the right DataFrame.
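
As a minimal sketch of the idea, the example below merges two small DataFrames whose key columns are named differently. The frames `df3` and `df4`, their column names, and their values are made up for illustration; only `pd.merge` with `left_on` and `right_on` is the point.

```python
import pandas as pd

# Hypothetical example data: the join key is called "lkey" in the left
# DataFrame and "rkey" in the right one.
df3 = pd.DataFrame({"lkey": ["b", "b", "a", "c"], "data1": [0, 1, 2, 3]})
df4 = pd.DataFrame({"rkey": ["a", "b", "d"], "data2": [10, 11, 12]})

# Name the key column for each side separately. The default is an inner join,
# so only keys present in both frames ("a" and "b") survive.
merged = pd.merge(df3, df4, left_on="lkey", right_on="rkey")
print(merged)
```

Because the two key columns have different names, both `lkey` and `rkey` appear in the result, holding identical values in every matched row.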

Other questions

Question 1

What is the primary function of the `unstack` method when applied to a hierarchically indexed Series in pandas?

Question 2

Which method is the inverse operation of `unstack` in pandas, used for pivoting columns into rows?

Question 3

When using `pandas.merge` without specifying the `on` or `how` arguments, what is the default behavior?

Question 4

What does the `set_index` function on a DataFrame accomplish?

Question 5

When performing a many-to-many merge in pandas, how is the resulting number of rows determined for matching keys?

Question 6

What is the purpose of the `suffixes` argument in the `pandas.merge` function?

Question 7

What is the primary difference between the DataFrame's `join` method and `pandas.merge`?

Question 8

What does the `combine_first` method do when used on two pandas Series or DataFrames?

Question 9

When using `pandas.concat` to combine several Series objects with `axis="columns"`, what do the values passed to the `keys` argument become in the resulting DataFrame?

Question 10

What is the effect of passing `ignore_index=True` to the `pandas.concat` function?

Question 11

What is the purpose of the DataFrame `pivot` method?

Question 12

Which method is described as the inverse operation to `pivot` for DataFrames, transforming data from a wide to a long format?

Question 13

In the `pandas.melt` function, what is the purpose of the `id_vars` argument?

Question 14

When sorting a hierarchically indexed object, what is the significance of the index being lexicographically sorted?

Question 15

How can you aggregate a DataFrame by a specific index level for summary statistics?

Question 16

What is the result of applying the `stack` method to the DataFrame created by `data.unstack()` in the code snippet `data = pd.Series([0.9, 0.2, 0.6, 0.7], index=[['a', 'a', 'b', 'b'],[1, 2, 1, 2]])`?

Question 17

When merging two DataFrames, `df1` and `df2`, on a key that results in a many-to-one join, how are the index values of the output DataFrame determined by default?

Question 18

Which `how` argument value in `pandas.merge` will result in a DataFrame containing the union of keys from both input DataFrames?

Question 19

If you perform an outer join on `df1` (with key 'c') and `df2` (with key 'd'), what values will appear in the columns that come from the DataFrame in which a key has no match?

Question 20

To merge a DataFrame `lefth` with columns `key1`, `key2` and a DataFrame `righth` with a hierarchical index, how must you specify the join keys?

Question 21

In a DataFrame `frame` with a MultiIndex on the rows with levels named `key1` and `key2`, what does the method `frame.swaplevel("key1", "key2")` do?

Question 22

If you concatenate two DataFrames with overlapping row indexes but different columns using `pd.concat([df1, df2], axis="columns")`, what is the outcome for rows that exist in one DataFrame but not the other?

Question 23

When using `pandas.pivot` to reshape a DataFrame, if the specified `index` and `columns` arguments result in multiple values for a given cell, what happens?

Question 24

What is a key difference between `stack` and `melt`?

Question 25

Consider a DataFrame `df` with columns A, B, C, D. What is the result of `pd.melt(df, id_vars=['A'], value_vars=['B', 'C'])`?

Question 26

By default, the `unstack` method pivots which level of a MultiIndex?

Question 27

If unstacking a level in a DataFrame results in some subgroups not having all the values present in that level, what does pandas introduce into the resulting DataFrame?

Question 28

Consider the code: `df1 = pd.DataFrame({'key': ['b', 'b', 'a'], 'data1': [0, 1, 2]})` and `df2 = pd.DataFrame({'key': ['a', 'b', 'd'], 'data2': [0, 1, 2]})`. What is the number of rows in the output of `pd.merge(df1, df2, how='left')`?

Question 29

What is the primary data structure returned by `pd.MultiIndex.from_arrays`?

Question 30

If `frame` has a hierarchical index on its columns with levels named `state` and `color`, how would you select all columns under the `Ohio` state?

Question 31

What is the key difference in output between `stack()` and `stack(dropna=False)`?

Question 32

If `df3` has a key column `lkey` and `df4` has a key column `rkey`, and the two are merged with `pd.merge(df3, df4, left_on="lkey", right_on="rkey")`, what happens to the `rkey` column in the output?

Question 33

What are the three fundamental data combination operations in pandas mentioned at the beginning of Section 8.2?

Question 34

If `left2` and `right2` are DataFrames with different columns but partially overlapping indexes, what is the result of `left2.join(right2, how="outer")`?

Question 35

When using `pd.concat`, how can you create a hierarchical index on the concatenation axis to identify the original pieces of data?

Question 36

Consider the DataFrame `data` created in In [126]. What is the shape of the output of `result = data.stack()`?

Question 37

If `long_data` is a DataFrame in long format with columns `date`, `item`, and `value`, what does the code `pivoted = long_data.pivot(index="date", columns="item", values="value")` produce?

Question 38

What is the key difference between the default behavior of `set_index` and `set_index(drop=False)`?

Question 40

If `left` has `key1`='foo', `key2`='one' with `lval`=1 and `right` has `key1`='foo', `key2`='one' with `rval`=4 and another row with `key1`='foo', `key2`='one' with `rval`=5, what is the number of rows in the output of `pd.merge(left, right, on=["key1", "key2"], how="inner")`?

Question 41

When merging `left1` on column 'key' and `right1` on its index, what arguments should be passed to `pd.merge`?

Question 42

Consider the NumPy array `arr = np.arange(12).reshape((3, 4))`. What is the shape of the output of `np.concatenate([arr, arr], axis=1)`?

Question 43

If `s1` is a Series with index ['a', 'b'] and `s4` is a Series with index ['a', 'b', 'f', 'g'], what happens to the 'f' and 'g' labels in the output of `pd.concat([s1, s4], axis="columns", join="inner")`?

Question 44

You have a list of DataFrames `[df1, df2]` where the row index does not contain relevant data. Which combination of arguments to `pd.concat` will combine them vertically and create a new, continuous integer index?

Question 45

If `a` and `b` are two Series with overlapping indexes and some null values, how does `a.combine_first(b)` determine the values in the resulting Series?

Question 46

If `df1` has a value of 1.0 in column 'a' at index 0, and `df2` has a value of 5.0 in column 'a' at index 0, what will be the value in column 'a' at index 0 of the result of `df1.combine_first(df2)`?

Question 47

When reshaping a DataFrame using `df.unstack(level="state")`, what does the unstacked level become in the resulting DataFrame's structure?

Question 48

If `long_data.pivot()` is called without the `values` argument, and there are multiple potential value columns, what is the structure of the resulting DataFrame?

Question 49

When is `pandas.melt` particularly useful without specifying any `id_vars`?

Question 50

A DataFrame `frame` is created with a hierarchical index with `key1` and `key2`, and columns with `state` and `color`. Given `frame.index.names = ["key1", "key2"]`, how many levels does the row index have?