  • Web Dev Bootcamp Day-20(Machine Learning cont.)
    TIL 2022. 5. 13. 21:01

    Logistic Regression

    graphical representation of logistic regression

    • using linear regression for binary classification can produce an inaccurate model, since its output is not limited to the range between 0 and 1
    • logistic regression outputs the probability that an input belongs to the positive class
    • the sigmoid function plays the role of an activation function by converting its input to a value between 0 and 1
    • the loss function is the cross-entropy function
      • Cross-entropy loss increases as the predicted probability diverges from the actual label.
    • input to the sigmoid: the weighted sum of the input features
    • output: the predicted probability of the positive class
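    The points above can be sketched in a few lines of NumPy. This is only an illustration: the weights, bias, inputs, and labels are made-up values, and the helper names sigmoid and binary_cross_entropy are my own, not from any library.

```python
import numpy as np

def sigmoid(z):
    """Map any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; eps guards against log(0)."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

# Hypothetical weights, bias, inputs, and labels, chosen only for illustration.
w = np.array([0.8, -0.4])
b = 0.1
x = np.array([[2.0, 1.0], [-1.0, 3.0]])
y = np.array([1.0, 0.0])

z = x @ w + b   # weighted sum of the inputs
p = sigmoid(z)  # predicted probability of class 1

loss_good = binary_cross_entropy(y, p)      # predictions agree with the labels
loss_bad = binary_cross_entropy(1 - y, p)   # same predictions, flipped labels
```

    Comparing loss_good and loss_bad shows the loss growing when the predicted probabilities move away from the labels.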

     

    logistic regression hypothesis formula
    sigmoid formula
    limits of sigmoid function
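
    The three captions above belong to images that are not reproduced here. As a reference, these are the standard textbook forms they most likely refer to; the symbols w (weights), b (bias), and z (weighted sum) are my own notation, not taken from the images.

```latex
h(x) = \sigma(w^\top x + b), \qquad
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad
\lim_{z \to -\infty} \sigma(z) = 0, \quad
\lim_{z \to +\infty} \sigma(z) = 1
```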


    Multinomial Logistic Regression Modeling

     

    • Encoding: representing the output classes as one-hot vectors indexed by class
      • ex) A = [1, 0, 0, 0, 0], B = [0, 1, 0, 0, 0]
    • Softmax function: rescales the outputs of the linear layer so that they are all positive and add up to 1
    • Loss function: cross-entropy measures the divergence between the predicted probability and the actual label
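
    A rough NumPy sketch of the three pieces above, assuming five classes A to E mapped to indices 0 to 4. The scores in logits are made-up numbers standing in for the linear outputs, and the helper names are hypothetical.

```python
import numpy as np

def one_hot(indices, num_classes):
    """Turn class indices into one-hot rows, e.g. 1 -> [0, 1, 0, 0, 0]."""
    return np.eye(num_classes)[indices]

def softmax(logits):
    """Rescale each row of scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_cross_entropy(y_onehot, probs, eps=1e-12):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    return -np.mean(np.sum(y_onehot * np.log(np.clip(probs, eps, 1.0)), axis=1))

# Two samples whose true classes are A (index 0) and B (index 1).
labels = np.array([0, 1])
y_onehot = one_hot(labels, num_classes=5)

# Made-up linear outputs, one row per sample and one column per class.
logits = np.array([[2.0, 0.5, 0.1, -1.0, 0.0],
                   [0.2, 3.0, 0.3, 0.1, -0.5]])
probs = softmax(logits)
loss = categorical_cross_entropy(y_onehot, probs)
```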

    Logistic Regression Modeling

    Preprocess Data

    • isolate the columns relevant for our analysis
      • df = pd.read_csv('file_name.csv', usecols=['relevant column 1', 'relevant column 2'])
    • delete rows that contain a null value in any of the relevant columns
      • df = df.dropna()
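
    To try the two steps above without a real file, here is a small self-contained sketch. The CSV text and the column names (age, income, city, purchased) are invented for the example; io.StringIO stands in for the file on disk.

```python
import io
import pandas as pd

# Hypothetical CSV contents standing in for 'file_name.csv'.
csv_text = """age,income,city,purchased
25,50000,Seoul,1
32,,Busan,0
41,72000,Daegu,1
,61000,Seoul,0
"""

# Read only the columns needed for the analysis ('city' is skipped).
df = pd.read_csv(io.StringIO(csv_text), usecols=['age', 'income', 'purchased'])

# Drop every row that has a missing value in any remaining column.
df = df.dropna()
```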

     

    Set x and y data from DataFrame

    • for binary classification, assign the column with the binary output to y_data and the remaining columns to x_data
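
    One possible way to do this with pandas, assuming a made-up DataFrame whose binary target column is called purchased (a hypothetical name):

```python
import pandas as pd

# Hypothetical DataFrame; 'purchased' is the assumed binary target column.
df = pd.DataFrame({
    'age': [25, 41, 37, 52],
    'income': [50000, 72000, 64000, 88000],
    'purchased': [1, 1, 0, 0],
})

y_data = df['purchased']                 # binary label (0 or 1)
x_data = df.drop(columns=['purchased'])  # every other column is a feature
```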

     

    Standardize Data

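    • use scikit-learn's StandardScaler to convert all x data to z-scores, (x - u) / s, where u is the column mean and s is the column standard deviation
      • from sklearn.preprocessing import StandardScaler
      • scaler = StandardScaler()
      • x_data_scaled = scaler.fit_transform(x_data)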
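
    To see what the z-score conversion does, the same arithmetic can be written by hand in NumPy. This is a sketch on made-up numbers, using the population standard deviation (ddof=0), which is also what StandardScaler uses by default.

```python
import numpy as np

# Hypothetical feature matrix: rows are samples, columns are features.
x_data = np.array([[25.0, 50000.0],
                   [41.0, 72000.0],
                   [37.0, 64000.0],
                   [52.0, 88000.0]])

u = x_data.mean(axis=0)  # per-column mean
s = x_data.std(axis=0)   # per-column population standard deviation (ddof=0)

# z-score: subtract the mean first, then divide by the standard deviation.
x_data_scaled = (x_data - u) / s
```

    After scaling, each column has mean 0 and standard deviation 1, so features on very different scales (age vs. income here) contribute comparably.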
     

    Split Data for Training Set / Validation Set / Testing Set

    • split the data 4:1 into a training set and a testing set (80% / 20%)
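
    A sketch of the split with scikit-learn's train_test_split. The data is random and only there to make the example runnable. The notes do not say how the validation set is made, so taking it out of the training set at the same 4:1 ratio is my assumption, as is the random_state value.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 samples, 3 features, binary labels.
rng = np.random.default_rng(0)
x_data_scaled = rng.normal(size=(100, 3))
y_data = rng.integers(0, 2, size=100)

# 4:1 split into training and testing sets.
x_train, x_test, y_train, y_test = train_test_split(
    x_data_scaled, y_data, test_size=0.2, random_state=2022)

# Assumption: carve the validation set out of the training set, again 4:1.
x_train, x_val, y_train, y_val = train_test_split(
    x_train, y_train, test_size=0.2, random_state=2022)
```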

    Keras Logistic Regression Modeling

    • use sigmoid as activation function
    • use binary crossentropy as loss function 
    • use the accuracy metric in addition to the loss metric
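
    Putting the three settings above together, a minimal Keras sketch could look like this. It assumes Keras is imported through TensorFlow. The random data, the adam optimizer, the epoch count, and the validation_split value are my own choices for illustration, not from the course.

```python
import numpy as np
from tensorflow import keras

# Hypothetical, already standardized data: 200 samples, 3 features.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(200, 3)).astype("float32")
y_train = (x_train[:, 0] + x_train[:, 1] > 0).astype("float32")

# A single Dense unit with a sigmoid activation acts as logistic regression.
model = keras.Sequential([
    keras.Input(shape=(3,)),
    keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(
    loss="binary_crossentropy",  # loss function from the notes
    optimizer="adam",            # assumed optimizer
    metrics=["accuracy"],        # tracked in addition to the loss
)

# Hold out 20% of the training data for validation while fitting.
history = model.fit(x_train, y_train, epochs=5, validation_split=0.2, verbose=0)

# Each prediction is a probability between 0 and 1.
probs = model.predict(x_train, verbose=0)
```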
     