Juni_DEV


Artificial Intelligence

Types of Cross-Validation

junni :p 2019. 6. 18. 15:21

K-fold Cross-Validation


 

Types of Cross-Validation

 

1. K-fold Cross-validation

    • Split the dataset into K sub-sets
    • Use the K-1 sub-sets that remain after holding one out as the training set, fitting K models in total
    • K=5 or K=10 is most commonly used (→ see the literature)
    • The smaller K is, the more biased the evaluation inevitably becomes
    • The larger K is, the lower the bias of the evaluation, but the variance of the results can increase

K-fold Cross Validation (K=5) 
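The splitting scheme above can be sketched in plain Python (in practice scikit-learn's `KFold` does this job; the `k_fold_indices` helper name here is illustrative):

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k folds; yield (train, test) per fold."""
    indices = list(range(n_samples))
    # distribute any remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# with 10 samples and K=5, each fold serves as the test set exactly once
splits = list(k_fold_indices(10, 5))
```

Every sample appears in exactly one test fold, so the K evaluation scores together cover the whole dataset.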

2. Hold-out Validation

    • A simple train/test split
    • Equivalent to setting K=1: build a single train/test split and evaluate the model on it
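A minimal sketch of a hold-out split (the `hold_out_split` name and 0.2 test ratio are illustrative choices, not a fixed API):

```python
import random

def hold_out_split(n_samples, test_ratio=0.2, seed=0):
    """Single train/test split: shuffle indices once, carve off test_ratio."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(n_samples * test_ratio)
    return indices[n_test:], indices[:n_test]

train, test = hold_out_split(100)  # 80 training / 20 test indices
```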

3. LOOCV (Leave-one-out Cross-validation)

    • K-fold cross-validation in which each fold contains exactly one sample
    • Set K to the total number of samples, so that every observation gets left out of training exactly once
    • Very slow on large datasets, but tends to give good results on small ones
    • Advantage: no data in the dataset is wasted
    • Disadvantage: evaluation is computationally expensive
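LOOCV is simply K-fold with K equal to the sample count, so the sketch is one loop (scikit-learn exposes this as `LeaveOneOut`):

```python
def loo_indices(n_samples):
    """Leave-one-out: each sample is the test set exactly once (K = n),
    so n models must be trained — the source of the high cost noted above."""
    for i in range(n_samples):
        train = [j for j in range(n_samples) if j != i]
        yield train, [i]

splits = list(loo_indices(4))  # 4 samples -> 4 train/test pairs
```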

4. Repeated Random Sub Sampling Validation

    • The test set is chosen at random on each round
    • Advantage: evaluation is computationally cheap
    • Disadvantage: the reliability of the estimate for future predictions is hard to guarantee
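This scheme (also called Monte Carlo cross-validation) is just the hold-out split repeated with a fresh random draw each round; a sketch with an illustrative `monte_carlo_splits` name:

```python
import random

def monte_carlo_splits(n_samples, n_repeats, test_ratio=0.2, seed=0):
    """Repeated random sub-sampling: draw a fresh random hold-out split per round.
    Unlike K-fold, a sample may land in the test set of several rounds, or none —
    one reason the resulting estimate is less reliable."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        n_test = int(n_samples * test_ratio)
        yield indices[n_test:], indices[:n_test]

splits = list(monte_carlo_splits(50, 10))  # 10 independent 80/20 splits
```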

5. Repeated K-fold Cross-validation

    • Repeat the k-fold cross-validation procedure n times
    • Shuffle the data before every repetition, so the samples are partitioned differently each time
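The two bullets above compose directly: shuffle, run K-fold, repeat. A sketch assuming for brevity that the sample count is divisible by K (scikit-learn's `RepeatedKFold` handles the general case):

```python
import random

def repeated_k_fold(n_samples, k, n_repeats, seed=0):
    """Repeat K-fold n_repeats times, reshuffling the data before each repetition
    so every repetition partitions the samples differently."""
    rng = random.Random(seed)
    fold = n_samples // k  # assumes n_samples divisible by k
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        for i in range(k):
            test = indices[i * fold:(i + 1) * fold]
            train = indices[:i * fold] + indices[(i + 1) * fold:]
            yield train, test

splits = list(repeated_k_fold(20, 5, 2))  # 2 repetitions x 5 folds = 10 splits
```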

6. Stratified K-fold Cross-validation

    • "Stratified" refers to stratification: sampling separately within each class (stratum)
    • Folds are chosen so that the mean response value is approximately equal across all folds
    • The process of rearranging the data so that each fold is representative of the whole dataset
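For classification, "representative" means each fold keeps roughly the overall class proportions. A minimal sketch that deals samples of each class round-robin into folds (scikit-learn's `StratifiedKFold` is the practical tool):

```python
from collections import defaultdict

def stratified_k_fold(labels, k):
    """Stratified K-fold: distribute each class's samples across the folds
    so every fold preserves approximately the overall class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        for pos, idx in enumerate(members):   # deal round-robin within each class
            folds[pos % k].append(idx)
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, f in enumerate(folds) if j != i for idx in f)
        yield train, test

# 60/40 class mix is preserved in every fold
splits = list(stratified_k_fold(['a'] * 6 + ['b'] * 4, 2))
```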

7. Nested Cross-validation

    • Run another cross-validation inside each cross-validation split
    • Split, then split again
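"Split, then split again" usually means: an inner loop selects a hyperparameter on each outer training set, and the outer loop scores that choice on data the selection never saw. A sketch where `fit_score(param, train, test)` is a hypothetical user-supplied callback returning a score:

```python
def k_fold(indices, k):
    """Yield (train, test) K-fold splits over an explicit index list."""
    fold = len(indices) // k
    for i in range(k):
        test = indices[i * fold:(i + 1) * fold]
        yield indices[:i * fold] + indices[(i + 1) * fold:], test

def nested_cv(n_samples, outer_k, inner_k, candidates, fit_score):
    """Nested CV: inner folds pick the best candidate hyperparameter per outer
    training set; outer folds score that pick on held-out data."""
    outer_scores = []
    for outer_train, outer_test in k_fold(list(range(n_samples)), outer_k):
        # inner loop: evaluate every candidate on splits of the outer training set
        best = max(candidates, key=lambda p: sum(
            fit_score(p, tr, te) for tr, te in k_fold(outer_train, inner_k)))
        outer_scores.append(fit_score(best, outer_train, outer_test))
    return outer_scores
```

The outer scores estimate the performance of the whole selection procedure, not just of one fixed model.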

8. Shuffle-split Cross-validation

    • Random-permutation cross-validation
    • A stratified variant, Stratified-Shuffle-Split, also exists

Shuffle-split Cross-validation
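The plain shuffle-split is the same repeated random draw shown in section 4; the stratified variant additionally draws the test fraction from every class. A sketch of that variant (scikit-learn ships it as `StratifiedShuffleSplit`; the function name here is illustrative):

```python
import random
from collections import defaultdict

def stratified_shuffle_split(labels, n_splits, test_ratio=0.25, seed=0):
    """Stratified shuffle-split: each of n_splits random splits takes
    test_ratio of *every class*, so class proportions are preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for _ in range(n_splits):
        train, test = [], []
        for members in by_class.values():
            members = members[:]          # shuffle a copy, keep the original order
            rng.shuffle(members)
            n_test = int(len(members) * test_ratio)
            test += members[:n_test]
            train += members[n_test:]
        yield sorted(train), sorted(test)

# 8 samples of class 0, 4 of class 1 -> each test set holds 2 of class 0, 1 of class 1
splits = list(stratified_shuffle_split([0] * 8 + [1] * 4, 3))
```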

 
