By now, few people have not heard of hackathons and Data Science competitions. I first heard about them six months ago. Since then I have entered every one I came across (and even won something), so I could not pass up AgroCode 2020, organized by the Russian Agricultural Bank. I placed among the top participants in several tracks and took a prize in one of them. Thanks to these results, I became a Data Science specialist at the Center for the Development of Financial Technologies of the Russian Agricultural Bank. Read on to find out how I did it.
The country's main agro-coding event
First, a few words about the event itself. AgroCode 2020 brought together people who care about new technologies in agriculture. It consisted of several activities:
Agro Data Science Cup, a data analysis competition with two tasks:
.
Agro Hack 6 :
.
, , . 10 .
Agro Idea, .
, , . , , , . . DS- . -10, - 2 !
. 17 .
?
: , ID , , ( ) 365 .
The metric was the F1 score from sklearn (with average="weighted").
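To make the metric concrete, here is a minimal pure-Python sketch of what a support-weighted F1 computes: the per-class F1 scores are averaged with weights proportional to how often each class occurs in the true labels. The function name `weighted_f1` and the toy labels are illustrative assumptions, not taken from the competition; in practice one would simply call `f1_score` from `sklearn.metrics` with `average="weighted"`.

```python
def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative sketch)."""
    labels = sorted(set(y_true) | set(y_pred))
    total = len(y_true)
    score = 0.0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        support = tp + fn  # number of true samples of this class
        score += f1 * support / total
    return score


# Toy example with hypothetical labels
example = weighted_f1([0, 0, 0, 1], [0, 0, 1, 1])
```

Because the weights follow class frequency, frequent classes dominate this score, which matters when the classes are imbalanced.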
, . : , . .
? ?
, , NDVI โ
, 4 : RGB . , RED โ , NIR โ .
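For reference, NDVI is conventionally computed per pixel from the near-infrared and red channels as (NIR - RED) / (NIR + RED). The snippet below is a minimal sketch of that standard formula on plain float values; the function name `ndvi` and the handling of a zero denominator are my own assumptions, not something taken from the competition data.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel (sketch)."""
    denom = nir + red
    if denom == 0:
        # Assumption: return 0.0 where both channels are zero
        return 0.0
    return (nir - red) / denom


# Hypothetical reflectance values: strong NIR, weak RED
dense_vegetation = ndvi(0.5, 0.1)
# Equal NIR and RED gives an index of zero
bare_surface = ndvi(0.3, 0.3)
```

The index ranges from -1 to 1 for non-negative inputs, with higher values where NIR reflectance exceeds RED reflectance.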
?
-, 45 , 279 . : - , () , .
-, , - ( - ). , .
-, . , , . .
. , , . - , , . , .
โฆ , , . , . .
ID, . . - . . : 2 4 .
?
, . - KFold StratifiedKFold , . . , . , . -. .
The model was CatBoost, with the following parameters:
params = {
    'iterations': 2000,
    'depth': 6,
    'early_stopping_rounds': 500,
    'l2_leaf_reg': 5,
    'bagging_temperature': 1,
    'random_seed': 17,
    'class_names': classes,
    'auto_class_weights': 'Balanced',
    'eval_metric': 'TotalF1',
    'loss_function': 'MultiClassOneVsAll',
    'task_type': 'GPU',
    'devices': '0:1',
    'verbose': 2000
}
โBalancedโ โMultiClassOneVsAllโ. . . , , , random_seed . - . , , . , , .
18 . , , . , 18 2 . , - . โ .
: . . , , , .
( ) . , .
. โ 1056 1056 .jpg. . , , . . : https://www.kaggle.com/maciejadamiak/lemons-quality-controldataset
The metric was ROC-AUC, computed as follows:
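from tensorflow import keras  # assumed import; the original snippet omits it

def score(y_true, y_preds):
    # Align ground truth and predictions by image_id
    table = y_true.merge(y_preds, left_on='image_id', right_on='image_id')
    m = keras.metrics.AUC(curve='ROC')
    # Columns 1-9 hold the true labels, columns 10 onward the predictions
    m.update_state(table.iloc[:, 1:10], table.iloc[:, 10:])
    return m.result().numpy()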
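As a point of comparison, ROC-AUC can also be written directly from its definition: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as half. The sketch below implements that exact pairwise form in pure Python. It is not the organizers' code: `keras.metrics.AUC` approximates the curve over a fixed set of thresholds and, as far as I can tell, pools all label columns together by default, so its value can differ slightly. The function name `roc_auc_pairwise` and the toy data are illustrative assumptions.

```python
def roc_auc_pairwise(y_true, y_score):
    """Exact ROC-AUC for binary labels via pairwise comparison (sketch)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example with hypothetical labels and scores
auc = roc_auc_pairwise([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of samples, so it is only meant for checking small cases by hand.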
.
csv-. .py , -. , . , .
20 . , . ? .
, , . , . , , , , .
. , .
# Assumed import; the original snippet omits it
from tensorflow.keras.preprocessing.image import ImageDataGenerator

aug = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.1,
    height_shift_range=0.1,
    brightness_range=[0.5, 1],
    shear_range=0.2,
    channel_shift_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    vertical_flip=True,
    fill_mode="nearest"
)
?
The architecture: a VGG16 backbone, followed by AveragePooling, a fully connected (Dense) layer, and Dropout.
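# Assumed imports; the original snippet omits them
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import AveragePooling2D, Dense, Dropout, Flatten

# VGG16 backbone without its classification head, trained from scratch
model = VGG16(weights=None, include_top=False, input_shape=[image_size, image_size, 3])
x = AveragePooling2D(pool_size=(2, 2))(model.output)
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
# Nine independent sigmoid outputs, one per label
output = Dense(9, activation='sigmoid')(x)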
, , .
KFold -. , , .
9 , .
. โ โ . , . .
:
, , .
โ .
( , , , ..).
5 , , , .
3 . .
: -.
, , , .
, :
, 19 30 , 23 25 . .
. , .
- . , .
7.
, . . , . :
: - , - , - ( ).
-? . , , . , , โ . :
?
, , , . , , Agro Hack :) ( , ).
, ? - , ! , .
!
, , , , , , .