Today, May 4, is Star Wars Day, and we have prepared a detailed guide to the main functions of the dplyr library. Why on Star Wars Day? Because we will walk through everything using the starwars dataset as the example.
Let's start!
By the way, why is Star Wars Day celebrated on May 4? It is a play on words: «May the Force be with you» sounds like «May, the 4th», so the date became May 4.
To begin, load dplyr with the library function.
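The examples use the starwars dataset, which ships with dplyr. Each row is one character, and the columns are:
name - name of the character
height - height (cm)
mass - weight (kg)
hair_color - hair color
skin_color - skin color
eye_color - eye color
birth_year - year of birth (BBY, before the Battle of Yavin)
sex - biological sex (male, female, hermaphroditic, or none for droids)
gender - gender role or identity (masculine, feminine)
homeworld - home planet
species - species
films - list of films the character appeared in
vehicles - list of vehicles the character has piloted
starships - list of starships the character has piloted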
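A quick way to check the column list above is to load the package and look at the data. This is a minimal sketch that assumes only that dplyr is installed:

```r
library(dplyr)

# starwars ships with dplyr, so no separate download is needed
dim(starwars)      # number of rows and columns
glimpse(starwars)  # one line per column: name, type, first values
```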
Now that the data is familiar, let's turn to dplyr. First, a few words about the package itself.
dplyr is one of the core packages of the tidyverse. Its closest counterpart in Python is Pandas.
dplyr syntax closely resembles SQL, and at Netpeak dplyr is treated as a companion to SQL.
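dplyr, like the rest of the tidyverse, is built around the tidy data concept. In short, «tidy data» means: every column is a variable, every row is an observation, and every cell is a single value.
We will look at all of this on the starwars dataset. Time to see what dplyr can do!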
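The first question: how do we pick columns? With select.
What if the table has many columns, say 20, and listing each one by name is tedious? For that, select accepts helper functions that match columns by name: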
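contains: the name contains the given string
ends_with: the name ends with the given string
matches: the name matches a regular expression
num_range: the name is a prefix plus a number from a range, like «V1, V2, V3...»
one_of: the name is one of the names in a character vector
starts_with: the name starts with the given string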
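Each helper takes a string. A pattern such as «t», for instance, selects every column whose name matches it, and a leading minus sign excludes those columns instead. A helper can match many columns or just 1.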
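The helpers described above can be sketched on starwars as follows. The particular columns and patterns are chosen only for illustration:

```r
library(dplyr)

# Pick columns by exact name
by_name <- starwars %>% select(name, height, mass)

# Pick columns whose names end with "color"
colors <- starwars %>% select(name, ends_with("color"))

# Pick columns whose names contain the letter "t"
with_t <- starwars %>% select(contains("t"))

# Drop columns instead of keeping them: a leading minus negates the selection
no_lists <- starwars %>% select(-films, -vehicles, -starships)

names(colors)
```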
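Note that select always returns a tibble, even when only one column is selected. What if a plain vector is needed instead? For that, dplyr has pull.
That covers choosing columns. Next come the rows.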
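The difference between a one-column tibble and a vector is easiest to see side by side. A minimal sketch, with variable names invented for the example:

```r
library(dplyr)

as_tibble_col <- starwars %>% select(name)  # still a tibble, with one column
as_vector     <- starwars %>% pull(name)    # a plain character vector

class(as_tibble_col)
class(as_vector)
```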
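Filtering rows is the equivalent of WHERE in SQL. In dplyr it is done with filter (an obvious name, isn't it?).
filter keeps the rows for which the condition evaluates to TRUE. Conditions are built from the usual operators:
& - logical AND
| - logical OR
>/< - greater than, less than
>= and <= - greater than or equal to, less than or equal to
is.na - the value is missing
!is.na - the value is not missing
%in% - the value belongs to a set
! - negation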
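The operators above combine as in the sketch below. The thresholds and eye colors are arbitrary values picked for the example:

```r
library(dplyr)

# Comma-separated conditions are combined with AND
tall_humans <- starwars %>%
  filter(species == "Human", height >= 180)

# OR, set membership, and a missing-value check in one condition
selected <- starwars %>%
  filter(eye_color %in% c("blue", "brown") | is.na(mass))

# Only the rows where mass is known
known_mass <- starwars %>% filter(!is.na(mass))
```

Rows where the condition evaluates to NA are dropped, just like rows where it is FALSE.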
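filter is not the only way to choose rows. To keep only unique rows, use distinct.
To take a random sample, use sample_n, which returns n randomly chosen rows.
slice picks rows by position, for example the first ten.
sample_frac also samples at random, but takes a fraction rather than a count. For example, with 0.5 it returns about half of the rows.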
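A short sketch of the four row-selection functions just described. The seed and the sizes are arbitrary choices for the example:

```r
library(dplyr)

set.seed(4)  # arbitrary seed so the random samples are repeatable

eye_colors <- starwars %>% distinct(eye_color)  # unique values of one column
five_rows  <- starwars %>% sample_n(5)          # 5 random rows
first_ten  <- starwars %>% slice(1:10)          # rows 1 to 10 by position
half       <- starwars %>% sample_frac(0.5)     # a random half of the rows
```

Since dplyr 1.0.0, sample_n and sample_frac are superseded by slice_sample, but they still work.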
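Once the rows are chosen, they often need to be sorted. In SQL this is ORDER BY. In dplyr it is arrange.
By default the sort is ascending, from smallest to largest. To sort in descending order, wrap the column in desc.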
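What about sorting by many columns at once? That works too :) Inside arrange, columns can be picked with the same helpers as in select by wrapping them in across.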
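Sorting can be sketched as follows. The choice of mass and the color columns is only an illustration:

```r
library(dplyr)

# Ascending by default: lightest characters first (NA values go last)
lightest <- starwars %>% arrange(mass)

# desc() reverses the order: heaviest first
heaviest <- starwars %>% arrange(desc(mass))

# across() lets arrange() reuse select-style helpers:
# sort by every column whose name ends with "color"
by_colors <- starwars %>% arrange(across(ends_with("color")))

heaviest %>% select(name, mass) %>% head(3)
```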
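Next comes aggregation: collapsing many rows into summary values, either for the whole table or per group. In SQL this is GROUP BY combined with functions such as sum, min, max. In dplyr … it works almost the same way.
Grouping starwars by eye_color gives 15 groups, one per eye color. Grouping alone computes nothing. The aggregates themselves are calculated by summarise.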
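Some aggregates come back as NA, because a single missing value in a group makes the whole result missing.
One way out is drop_na from tidyr, which removes the rows that contain NA before the calculation. Otherwise every aggregate over such a group is NA.
A single summarise call can also return several columns, for example 4 aggregates at once.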
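Grouping and summarising together might look like the sketch below. To stay within dplyr it uses filter(!is.na(mass)) as a stand-in for tidyr's drop_na, and the output column names are invented for the example:

```r
library(dplyr)

by_eye <- starwars %>%
  filter(!is.na(mass)) %>%   # dplyr-only stand-in for tidyr::drop_na(mass)
  group_by(eye_color) %>%
  summarise(
    n         = n(),         # characters per eye color
    mass_min  = min(mass),
    mass_max  = max(mass),
    mass_mean = mean(mass)
  )

by_eye
```

An alternative is to keep all rows and pass na.rm = TRUE to each aggregate function.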
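Besides the usual sum, min and max, summarise works with many other functions:
n_distinct - number of unique values
last - last value
nth - n-th value
quantile - quantile
IQR - interquartile range
mad - median absolute deviation
sd - standard deviation
var - variance
And the list goes on …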
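What if the same aggregate is needed for several columns, not just mass but both mass and height?
This is what across is for.
across applies the function to mass and height in one step. By default the new column names join the column name and the function name with an underscore («_»); the .names argument changes that pattern.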
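The across call described above can be sketched like this. The grouping column, the two functions and the naming pattern are assumptions made for the example:

```r
library(dplyr)

stats <- starwars %>%
  group_by(gender) %>%
  summarise(
    across(
      c(mass, height),
      list(mean   = ~ mean(.x, na.rm = TRUE),
           median = ~ median(.x, na.rm = TRUE)),
      .names = "{.fn}_of_{.col}"  # the default pattern is "{.col}_{.fn}"
    )
  )

names(stats)
```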
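So far every operation worked with existing columns. The next task is creating new ones. Suppose there is a column A and a column B, and the ratio A/B is needed as a new column. This is done with mutate.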
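As an illustration of computing one column from two others, the sketch below derives a body mass index. The columns height_m and bmi are invented for the example and are not part of starwars:

```r
library(dplyr)

with_ratio <- starwars %>%
  select(name, height, mass) %>%
  mutate(
    height_m = height / 100,      # height is stored in centimetres
    bmi      = mass / height_m^2  # a new column can use one created just above
  )

head(with_ratio)
```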
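mutate can also change existing columns.
Several at once? Again with across. Suppose several columns have to be multiplied by 10. By default the results overwrite the original columns.
To keep the originals, write the results to new columns with the «_new» suffix. Column names can be processed further with stringr from the tidyverse, for example to strip _new again.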
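A minimal sketch of changing several columns at once while keeping the originals. Multiplying by 10 and the choice of height and mass are assumptions for the example:

```r
library(dplyr)

scaled <- starwars %>%
  select(name, height, mass) %>%
  mutate(across(c(height, mass), ~ .x * 10, .names = "{.col}_new"))

names(scaled)
```

Without the .names argument, across would overwrite height and mass in place.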
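mutate also supports window functions, familiar from SQL. For example, characters can be ranked by mass with dense_rank. The rank is stored in a new column, rnk.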
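Ranking by mass could be sketched as follows. Ranking in descending order, so that the heaviest character gets rank 1, is an assumption of this example:

```r
library(dplyr)

ranked <- starwars %>%
  select(name, mass) %>%
  mutate(rnk = dense_rank(desc(mass))) %>%  # 1 = heaviest; a missing mass gives NA
  arrange(rnk)

head(ranked)
```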
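There are many more functions that are useful inside mutate. The most common ones:
lag
lead
cumsum
dense_rank
ntile
row_number
case_when
coalesce
If you have worked with window functions and CASE expressions, these will look 100% familiar from SQL.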
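Several of the functions listed above are shown together in the sketch below. The demo table and its column names are made up for the example:

```r
library(dplyr)

demo <- tibble(day = 1:5, sales = c(10, NA, 30, 20, 50))  # made-up data

result <- demo %>%
  mutate(
    sales_filled = coalesce(sales, 0),      # replace NA with 0
    prev_day     = lag(sales_filled),       # value from the previous row
    next_day     = lead(sales_filled),      # value from the next row
    running      = cumsum(sales_filled),    # running total
    half         = ntile(sales_filled, 2),  # split into 2 buckets by value
    row          = row_number(),            # position in the table
    size         = case_when(               # SQL-style CASE WHEN
      sales_filled >= 30 ~ "big",
      sales_filled > 0   ~ "small",
      TRUE               ~ "none"
    )
  )

result
```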
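So far we have worked with a single table. Real tasks usually involve several tables that need to be combined. The join functions in dplyr are named just like in SQL: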
left_join
right_join
inner_join
full_join
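They behave just like their SQL counterparts. For the examples, one table is starwars and the other, df, is a smaller table of characters. If the key columns are named differently, use rename so that the key is called name in both tables. The join key is passed in the by argument (the ON clause in SQL).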
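inner_join returns 35 rows, because df has only 35 rows. If by is omitted, dplyr joins on all columns that have the same name in both tables.
full_join returns 87 rows, because starwars has 87 rows.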
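The join types can be compared on a small example. The df below is a hypothetical three-row lookup table, not the 35-row table from the text, and its last name is deliberately absent from starwars:

```r
library(dplyr)

# Hypothetical lookup table; "Grogu" is not in starwars
df <- tibble(
  name = c("Luke Skywalker", "Yoda", "Grogu"),
  side = c("light", "light", "unknown")
)

inner <- inner_join(starwars, df, by = "name")  # only names present in both
left  <- left_join(starwars, df, by = "name")   # all starwars rows, side may be NA
full  <- full_join(starwars, df, by = "name")   # every row from both tables

c(inner = nrow(inner), left = nrow(left), full = nrow(full))
```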
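There is one catch. Because both tables contain a mass column, the result has two copies of it, with the suffixes .x and .y. How can this be avoided?
Option 1: add the column to by. If the values do not match and it cannot serve as a key, rename it in one of the tables, for example to new_name.
Option 2: before joining, keep only the columns that are actually needed from the second table, so nothing is duplicated.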
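The duplicated-column problem and two ways around it can be sketched as follows. The table df and its values are hypothetical:

```r
library(dplyr)

# Hypothetical second table that also has a mass column
df <- tibble(name = c("Luke Skywalker", "Yoda"), mass = c(80, 20))

# Without any handling, both copies survive as mass.x and mass.y
clash <- inner_join(starwars, df, by = "name")

# Rename the column in one table before joining
renamed <- inner_join(starwars, rename(df, new_name = mass), by = "name")

# Or keep only the columns that are needed from the second table
trimmed <- inner_join(starwars, select(df, name), by = "name")

names(clash)
```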
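Besides joins, dplyr has other functions for combining tables:
bind_rows - «stacks» tables on top of each other
bind_cols - «glues» tables side by side
intersect - rows present in both tables
setdiff - rows present in the first table but not in the second
union - rows from both tables, without duplicates
union_all - like union, but duplicates are kept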
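The functions in this list are easiest to compare on two tiny tables. The tables a and b are made up for the example:

```r
library(dplyr)

a <- tibble(id = c(1, 2, 3))
b <- tibble(id = c(3, 4))

bind_rows(a, b)   # stack rows: 1, 2, 3, 3, 4
intersect(a, b)   # rows in both tables: 3
setdiff(a, b)     # rows in a but not in b: 1, 2
union(a, b)       # rows from either table, duplicates removed
union_all(a, b)   # rows from either table, duplicates kept

# Glue columns side by side; both tables must have the same number of rows
bind_cols(a, tibble(label = c("x", "y", "z")))
```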
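That covers the main functions of dplyr. There is more to the package, but this is enough to get started :)
Thanks for reading. May the Force be with you!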