Steps for building a recommendation system in staff training

In recent years, programs that try to predict which items will interest a user, based on information in the user's profile, have become widespread. Until 2006 such algorithms were not popular. That changed in the fall of 2006, when Netflix offered developers $1,000,000 for the best prediction algorithm. The competition lasted three years.





Today we will share our experience of building a recommendation system for staff training.





.





?

– IT- . , . . , , .





. . , .









1. Content-based filtering

, . , .





2. Collaborative filtering

, .





3. ,

– . , .  





?

  • . Users. , , Users.





  • . ( ). , , . …





  • features ( ) Users.





Users :





  • / ( );





  • / ;





  • ;





  • (Data Analyst, Data Engineer, Data Scientist);





  • ( );





  • ( ).





Users .





MVP , . . Users :





  • (-1, +2);





  • – ;





  • – ;





  • – Data Scientist;





  • – 5 ( 20 65);





  • - 5 .





Users – 3 .





– 6 ( 2 User).





– Python.





(DataSet), , , Users.





User 3 Users .





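import random

import pandas as pd

# Walk through every user (column) of the DataSet.
# Assumed to be loaded earlier: df (feature matrix, one column per user, with a 'pos' row),
# df3 (course-completion flags, one row per 'Tab'), df4 (user directory with 'Tab', 'TB', 'FIO')
# and df777 (list of courses that may be recommended).
recList = []
for row in df:
    corrMatr = df.corrwith(df[row])  # correlation of every user with the current user
    corrMatr = pd.DataFrame(corrMatr)
    tempMatr = corrMatr  # candidates for the nearest neighbours
    tempMatr = tempMatr.drop([row], axis=0)  # a user cannot be their own neighbour
    li = list()   # courses already recommended to the current user
    li2 = list()  # neighbours already examined
    print(row)
    k = 0  # number of recommendations collected so far
    while k < 6:
        if len(tempMatr) == 0:  # if tempMatr is empty, leave the while loop
            break
        name = tempMatr.idxmax().item()  # remaining user with the highest correlation
        dp = df3[df3['Tab'] == name].set_index('Tab')  # rows of df3 where Tab equals name
        if name not in li2 and ((df[name]['pos'] <= df[row]['pos'] + 2 and df[name]['pos'] >= df[row]['pos'])):
            # the neighbour is new and its 'pos' lies between the user's 'pos' and 'pos' + 2
            li2.append(name)
            col_dp = dp.columns.tolist()  # course columns of the DataFrame
            random.shuffle(col_dp)  # shuffle so that courses are picked in random order
            for yy in col_dp:  # go through the courses
                if pd.DataFrame(df3[df3['Tab'] == name][yy]).reset_index()[yy][0] == 1 and \
                        pd.DataFrame(df3[df3['Tab'] == row][yy]).reset_index()[yy][0] == 0 and \
                        yy not in li and yy in df777[''].tolist():
                    # the neighbour completed the course, the user did not,
                    # and the course has not been recommended to this user yet
                    recList.append([row, name, yy,
                                    pd.DataFrame(df4[df4['Tab'] == row]['TB']).reset_index()['TB'][0],
                                    pd.DataFrame(df4[df4['Tab'] == name]['TB']).reset_index()['TB'][0],
                                    pd.DataFrame(df4[df4['Tab'] == row]['FIO']).reset_index()['FIO'][0],
                                    pd.DataFrame(df4[df4['Tab'] == name]['FIO']).reset_index()['FIO'][0]])
                    k += 1
                    li.append(yy)
                    # remove the neighbour that was just used from tempMatr
                    tempMatr = tempMatr.drop([tempMatr.idxmax().item()], axis=0)
                    break  # leave the for loop
        else:  # the neighbour does not qualify: remove it from tempMatr
            tempMatr = tempMatr.drop([tempMatr.idxmax().item()], axis=0)
# Collect the results in a DataFrame and save them to Excel
recomendations = pd.DataFrame(recList)
recomendations.to_excel('.xlsx')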
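
The two core steps of the loop above can be tried on a toy example: ranking users by correlation, and picking courses that a similar user has completed while the target user has not. This is a minimal sketch, not the production code. The data, the user and course names, and the helper functions `most_similar_users` and `candidate_courses` are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy data: one column per user, one row per numeric feature.
features = pd.DataFrame(
    {
        "u1": [1.0, 2.0, 3.0, 4.0],
        "u2": [2.0, 4.0, 6.0, 8.5],
        "u3": [4.0, 3.0, 2.0, 1.0],
        "u4": [1.0, 3.0, 2.0, 5.0],
    },
    index=["f1", "f2", "f3", "f4"],
)

# Hypothetical completion flags: 1 = course completed, 0 = not completed.
completed = pd.DataFrame(
    {"sql": [0, 1, 1, 1], "python": [1, 1, 0, 0], "ml": [0, 1, 0, 1]},
    index=["u1", "u2", "u3", "u4"],
)


def most_similar_users(features, user, n=3):
    """Return the n users whose feature columns correlate most with `user`."""
    corr = features.corrwith(features[user]).drop(user)
    return corr.sort_values(ascending=False).head(n).index.tolist()


def candidate_courses(completed, user, neighbour):
    """Return courses the neighbour has completed and the user has not."""
    mask = (completed.loc[neighbour] == 1) & (completed.loc[user] == 0)
    return mask[mask].index.tolist()


neighbours = most_similar_users(features, "u1", n=2)
print(neighbours)                                          # ['u2', 'u4']
print(candidate_courses(completed, "u1", neighbours[0]))   # ['sql', 'ml']
```

Sorting the correlations once and taking the top `n` gives the same ordering as calling `idxmax` repeatedly and dropping the winner each time, which is what the loop above does.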
      
      



.





. :





  • (, );





  • .





.





The recommendation algorithm ran in pilot mode for one quarter. The MVP reached the target conversion rate of 25% set by management, so we consider it successful and ready for production rollout.







