김민수

Add light_model and improve the README

1 +FROM ufoym/deepo:pytorch-cpu
2 +# https://github.com/Beomi/deepo-nlp/blob/master/Dockerfile
3 +# Install JVM for Konlpy
4 +RUN apt-get update && \
5 + apt-get upgrade -y && \
6 + apt-get install -y \
7 + openjdk-8-jdk wget curl git python3-dev \
8 + language-pack-ko
9 +
10 +RUN locale-gen en_US.UTF-8 && \
11 + update-locale LANG=en_US.UTF-8
12 +
13 +# Install zsh
14 +RUN apt-get install -y zsh && \
15 + sh -c "$(curl -fsSL https://raw.github.com/robbyrussell/oh-my-zsh/master/tools/install.sh)"
16 +
17 +# Install other packages
18 +RUN pip install --upgrade pip
19 +RUN pip install autopep8
20 +RUN pip install konlpy
21 +RUN pip install torchtext pytorch_pretrained_bert
22 +# Install dependency of styling chatbot
23 +RUN pip install hgtk chatspace
24 +
25 +# Add Mecab-Ko
26 +RUN curl -L https://raw.githubusercontent.com/konlpy/konlpy/master/scripts/mecab.sh | bash
27 +# Install the styling chatbot by BM-K
28 +RUN git clone https://github.com/km19809/light_model.git
29 +RUN pip install -r light_model/requirements.txt
30 +
31 +# Add non-root user
32 +RUN adduser --disabled-password --gecos "" user
33 +
34 +# Reset Workdir
35 +WORKDIR /light_model
1 +# Lightweight model of the styling chatbot
2 +This repository hosts a lightweight model for web deployment.\
3 +The original repository is here: [link](https://github.com/km19809/Styling-Chatbot-with-Transformer)
4 +
5 +## Requirements
6 +
7 +The list below may change during development, so please refer to requirements.txt.
8 +```
9 +torch~=1.4.0
10 +Flask~=1.1.2
11 +torchtext~=0.6.0
12 +hgtk~=0.1.3
13 +konlpy~=0.5.2
14 +chatspace~=1.0.1
15 +```
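
The dependencies can be installed in the usual way, for example (assuming pip targets the environment that will run the chatbot):

```
pip install -r requirements.txt
```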
16 +
17 +## Usage
18 +`light_chatbot.py [--train] [--per_soft|--per_rough]`
19 +
20 +* train: trains a new model and saves it. \
21 +If omitted, the saved model is loaded and tested.
22 +* per_soft: trains or tests the soft speech style.\
23 +With per_rough, the rough speech style is trained or tested instead.\
24 +The two options are mutually exclusive. An example invocation follows this list.
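
For example, training the soft-style model and then chatting with it in test mode might look like this (a sketch; it assumes the dataset files referenced in light_chatbot.py are present):

```
python light_chatbot.py --train --per_soft
python light_chatbot.py --per_soft
```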
25 +
26 +`app.py`
27 +
28 +A simple Flask server for testing the chatbot.
1 +import torch
2 +import csv
3 +import hgtk
4 +from konlpy.tag import Mecab
5 +import random
6 +
7 +mecab = Mecab()
9 +positive_emo = ['ㅎㅎ', '~']
10 +negative_emo = ['...', 'ㅠㅠ']
11 +missing_banmal_words = []  # words not found in the banmal dictionary
12 +
13 +
14 +# Morphological analysis with Mecab.
15 +def mecab_token_pos_flat_fn(string: str):
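    # example output format: ['오늘/MAG', '날씨/NNG', '좋/VA', '다/EF', './SF']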
16 + tokens_ko = mecab.pos(string)
17 + return [str(pos[0]) + '/' + str(pos[1]) for pos in tokens_ko]
18 +
19 +
20 +# Helper for the rough style: finds the pronoun NP ('저', '제') and replaces it with '나' or '내'.
21 +def exchange_NP(target: str):
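    # e.g. if the analysis contains '저/NP', returns ('나', index_of_that_token, True)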
22 + keyword = []
23 + ko_sp = mecab_token_pos_flat_fn(target)
24 +    _idx = -1  # default value on failure
25 + for idx, word in enumerate(ko_sp):
26 + if word.find('NP') > 0:
27 + keyword.append(word.split('/'))
28 + _idx = idx
29 + break
30 +    if not keyword:  # no NP was found
31 + return '', _idx, False
32 +
33 + if keyword[0][0] == '저':
34 + keyword[0][0] = '나'
35 + elif keyword[0][0] == '제':
36 + keyword[0][0] = '내'
37 + else:
38 + return keyword[0], _idx, False
39 +
40 + return keyword[0][0], _idx, True
41 +
42 +
43 +# Converts a word to the soft or rough speech style.
44 +def make_special_word(target: str, per_rough: bool, search_ec: bool):
45 +    # Split the sentence with Mecab (example output: ['오늘/MAG', '날씨/NNG', '좋/VA', '다/EF', './SF'])
46 +    ko_sp = mecab_token_pos_flat_fn(target)
47 +
48 +    keyword = []
49 +    _idx = -1  # default value on failure
50 +    # If a word contains the ending 'EF' (or 'EC'), extract its index and keyword.
51 + for idx, word in enumerate(ko_sp):
52 + if word.find('EF') > 0:
53 + keyword.append(word.split('/'))
54 + _idx = idx
55 + break
56 + if search_ec:
57 + if ko_sp[-2].find('EC') > 0:
58 + keyword.append(ko_sp[-2].split('/'))
59 + _idx = len(ko_sp) - 1
60 + break
61 + else:
62 + continue
63 +
64 +    # Return if no 'EF' was found.
65 + if not keyword:
66 + return '', _idx
67 + else:
68 + _keyword = keyword[0]
69 +
70 + if per_rough:
71 + return _keyword[0], _idx
72 +
73 +    # Decompose the keyword into jamo with hgtk (example output: 하ᴥ세요)
74 + h_separation = hgtk.text.decompose(_keyword[0])
75 +    total_word = ''.join(h_separation)
76 +
77 +    # Style the 'EF' ending by attaching the final consonant 'ㅇ'
78 +    total_word = replace_right(total_word, "ᴥ", "ㅇᴥ", 1)
79 +
80 +    # Recompose the jamo, e.g. '하세요' -> '하세용'.
81 +    h_combine = hgtk.text.compose(total_word)
85 +
86 + return h_combine, _idx
87 +
88 +
89 +# Builds the special tokens.
90 +def make_special_token(per_rough: bool):
91 +    # special tokens for expressing emotion
92 + target_special_voca = []
93 +
94 + banmal_dict = get_rough_dic()
95 +
96 +    # From the chatbot answers in the training set, extract 'EF' endings and attach the final consonant 'ㅇ' to build special tokens
97 + with open('chatbot_0325_ALLLABEL_train.txt', 'r', encoding='utf-8') as f:
98 + rdr = csv.reader(f, delimiter='\t')
99 + for idx, line in enumerate(rdr):
100 + target = line[2] # chatbot answer
101 + exchange_word, _ = make_special_word(target, per_rough, False)
102 + target_special_voca.append(str(exchange_word))
103 + target_special_voca = list(set(target_special_voca))
104 +
105 + banmal_special_voca = []
106 + for i in range(len(target_special_voca)):
107 +        try:
108 +            banmal_special_voca.append(banmal_dict[target_special_voca[i]])
109 +        except KeyError:
110 +            if per_rough:
111 +                print("not in the banmal dictionary")
113 +
114 +    # Add emoticon tokens.
115 + target_special_voca.append('ㅎㅎ')
116 + target_special_voca.append('~')
117 + target_special_voca.append('ㅠㅠ')
118 + target_special_voca.append('...')
119 + target_special_voca = target_special_voca + banmal_special_voca
120 +
121 +    # '<posi>' marks positive, '<nega>' marks negative
122 + return ['<posi>', '<nega>'], target_special_voca
123 +
124 +
125 +# Like str.replace, but replacing from the rightmost occurrence.
126 +def replace_right(original: str, old: str, new: str, count_right: int):
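    # Example (len(old) is assumed to be 1, as at every call site here; 'ᴥ' is the
    # syllable separator produced by hgtk.text.decompose):
    #   replace_right("ㅎㅏᴥㅅㅔᴥㅇㅛᴥ", "ᴥ", "ㅇᴥ", 1) == "ㅎㅏᴥㅅㅔᴥㅇㅛㅇᴥ"  # composes to '하세용'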
127 + text = original
128 +
129 + count_find = original.count(old)
130 +    # If count_right exceeds the number of occurrences of old, replace all of them (count_find); otherwise replace count_right occurrences.
131 + repeat = count_find if count_right > count_find else count_right
132 + for _ in range(repeat):
133 +        find_index = text.rfind(old)  # rfind locates the rightmost occurrence
134 + text = text[:find_index] + new + text[find_index + 1:]
135 +
136 + return text
137 +
138 +
139 +# Applies style changes to the tensors fed into the transformer as input and output.
140 +def styling(enc_input, dec_input, dec_output, dec_outputs, enc_label, max_len: int, per_soft: bool, per_rough: bool, TEXT, LABEL):
141 + device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
142 +
143 + pad_tensor = torch.tensor([LABEL.vocab.stoi['<pad>']]).type(dtype=torch.int32).to(device)
144 +
145 + temp_enc = enc_input.data.cpu().numpy()
146 + batch_sentiment_list = []
147 +
148 +    # soft personality
149 +    if per_soft:
150 +        # encoder input: rewritten into the form '나는 너를 좋아해 <posi> <pad> <pad> ...'
151 + for i in range(len(temp_enc)):
152 + for j in range(max_len):
153 + if temp_enc[i][j] == 1 and enc_label[i] == 0:
154 + temp_enc[i][j] = TEXT.vocab.stoi["<nega>"]
155 + batch_sentiment_list.append(0)
156 + break
157 + elif temp_enc[i][j] == 1 and enc_label[i] == 1:
158 + temp_enc[i][j] = TEXT.vocab.stoi["<posi>"]
159 + batch_sentiment_list.append(1)
160 + break
161 +
162 + enc_input = torch.tensor(temp_enc, dtype=torch.int32).to(device)
163 +
164 + for i in range(len(dec_outputs)):
165 + dec_outputs[i] = torch.cat([dec_output[i], pad_tensor], dim=-1)
166 +
167 + temp_dec = dec_outputs.data.cpu().numpy()
168 +
169 +        dec_outputs_sentiment_list = []  # emotion token chosen for each decoder sequence
170 +
171 +        # decoder outputs: rewritten into the form '저도 좋아용 ㅎㅎ <eos> <pad> <pad> ...'
172 +        for i in range(len(temp_dec)):  # iterate over the batch
173 + temp_sentence = ''
174 + sa_ = batch_sentiment_list[i]
175 + if sa_ == 0:
176 + sa_ = random.choice(negative_emo)
177 + elif sa_ == 1:
178 + sa_ = random.choice(positive_emo)
179 + dec_outputs_sentiment_list.append(sa_)
180 +
181 + for ix, token_i in enumerate(temp_dec[i]):
182 + if LABEL.vocab.itos[token_i] in ['<sos>', '<eos>', '<pad>']:
183 + continue
184 + temp_sentence = temp_sentence + LABEL.vocab.itos[token_i]
185 +            temp_sentence = temp_sentence + '.'  # the trailing period affects the morphological analysis
186 + exchange_word, idx = make_special_word(temp_sentence, per_rough, True)
187 +
188 + if exchange_word == '':
189 + for j in range(len(temp_dec[i])):
190 + if temp_dec[i][j] == LABEL.vocab.stoi['<eos>']:
191 + temp_dec[i][j] = LABEL.vocab.stoi[sa_]
192 + temp_dec[i][j + 1] = LABEL.vocab.stoi['<eos>']
193 + break
194 + continue
195 +
196 + for j in range(len(temp_dec[i])):
197 + if LABEL.vocab.itos[temp_dec[i][j]] == '<eos>':
198 + temp_dec[i][j - 1] = LABEL.vocab.stoi[exchange_word]
199 + temp_dec[i][j] = LABEL.vocab.stoi[dec_outputs_sentiment_list[i]]
200 + temp_dec[i][j + 1] = LABEL.vocab.stoi['<eos>']
201 + break
202 + elif temp_dec[i][j] != LABEL.vocab.stoi['<eos>'] and j + 1 == len(temp_dec[i]):
203 + print("\t-ERROR- No <EOS> token")
204 + exit()
205 +
206 + dec_outputs = torch.tensor(temp_dec, dtype=torch.int32).to(device)
207 +
208 + temp_dec_input = dec_input.data.cpu().numpy()
209 +        # decoder input: rewritten into the form '<sos> 저도 좋아용 ㅎㅎ <eos> <pad> <pad> ...'
210 + for i in range(len(temp_dec_input)):
211 + temp_sentence = ''
212 + for ix, token_i in enumerate(temp_dec_input[i]):
213 + if LABEL.vocab.itos[token_i] in ['<sos>', '<eos>', '<pad>']:
214 + continue
215 + temp_sentence = temp_sentence + LABEL.vocab.itos[token_i]
216 +            temp_sentence = temp_sentence + '.'  # the trailing period affects the morphological analysis
217 + exchange_word, idx = make_special_word(temp_sentence, per_rough, True)
218 +
219 + if exchange_word == '':
220 + for j in range(len(temp_dec_input[i])):
221 + if temp_dec_input[i][j] == LABEL.vocab.stoi['<eos>']:
222 + temp_dec_input[i][j] = LABEL.vocab.stoi[dec_outputs_sentiment_list[i]]
223 + temp_dec_input[i][j + 1] = LABEL.vocab.stoi['<eos>']
224 + break
225 + continue
226 +
227 + for j in range(len(temp_dec_input[i])):
228 + if LABEL.vocab.itos[temp_dec_input[i][j]] == '<eos>':
229 + temp_dec_input[i][j - 1] = LABEL.vocab.stoi[exchange_word]
230 + temp_dec_input[i][j] = LABEL.vocab.stoi[dec_outputs_sentiment_list[i]]
231 + temp_dec_input[i][j + 1] = LABEL.vocab.stoi['<eos>']
232 + break
233 + elif temp_dec_input[i][j] != LABEL.vocab.stoi['<eos>'] and j + 1 == len(temp_dec_input[i]):
234 + print("\t-ERROR- No <EOS> token")
235 + exit()
236 +
237 + dec_input = torch.tensor(temp_dec_input, dtype=torch.int32).to(device)
238 +
239 +    # rough personality
240 + elif per_rough:
241 + banmal_dic = get_rough_dic()
242 +
243 + for i in range(len(dec_outputs)):
244 + dec_outputs[i] = torch.cat([dec_output[i], pad_tensor], dim=-1)
245 +
246 + temp_dec = dec_outputs.data.cpu().numpy()
247 +
248 +        # decoder outputs: rewritten into the form '나도 좋아 <eos> <pad> <pad> ...'
249 +        for i in range(len(temp_dec)):  # iterate over the batch
250 + temp_sentence = ''
251 + for ix, token_i in enumerate(temp_dec[i]):
252 + if LABEL.vocab.itos[token_i] == '<eos>':
253 + break
254 + temp_sentence = temp_sentence + LABEL.vocab.itos[token_i]
255 +            temp_sentence = temp_sentence + '.'  # the trailing period affects the morphological analysis
256 + exchange_word, idx = make_special_word(temp_sentence, per_rough, True)
257 + exchange_NP_word, NP_idx, exist = exchange_NP(temp_sentence)
258 +
259 + if exist:
260 + temp_dec[i][NP_idx] = LABEL.vocab.stoi[exchange_NP_word]
261 +
262 + if exchange_word == '':
263 + continue
264 +            try:
265 +                exchange_word = banmal_dic[exchange_word]
266 +            except KeyError:
267 +                missing_banmal_words.append(exchange_word)
268 +                print("not in the banmal dictionary")
270 +
271 + temp_dec[i][idx] = LABEL.vocab.stoi[exchange_word]
272 + temp_dec[i][idx + 1] = LABEL.vocab.stoi['<eos>']
273 + for k in range(idx + 2, max_len):
274 + temp_dec[i][k] = LABEL.vocab.stoi['<pad>']
275 +
276 + # for j in range(len(temp_dec[i])):
277 + # if LABEL.vocab.itos[temp_dec[i][j]]=='<eos>':
278 + # break
279 + # print(LABEL.vocab.itos[temp_dec[i][j]], end='')
280 + # print()
281 +
282 + dec_outputs = torch.tensor(temp_dec, dtype=torch.int32).to(device)
283 +
284 + temp_dec_input = dec_input.data.cpu().numpy()
285 +        # decoder input: rewritten into the form '<sos> 나도 좋아 <eos> <pad> <pad> ...'
286 + for i in range(len(temp_dec_input)):
287 + temp_sentence = ''
288 + for ix, token_i in enumerate(temp_dec_input[i]):
289 + if ix == 0:
290 + continue # because of token <sos>
291 + if LABEL.vocab.itos[token_i] == '<eos>':
292 + break
293 + temp_sentence = temp_sentence + LABEL.vocab.itos[token_i]
294 +            temp_sentence = temp_sentence + '.'  # the trailing period affects the morphological analysis
295 + exchange_word, idx = make_special_word(temp_sentence, per_rough, True)
296 + exchange_NP_word, NP_idx, exist = exchange_NP(temp_sentence)
297 + idx = idx + 1 # because of token <sos>
298 + NP_idx = NP_idx + 1
299 +
300 + if exist:
301 + temp_dec_input[i][NP_idx] = LABEL.vocab.stoi[exchange_NP_word]
302 +
303 + if exchange_word == '':
304 + continue
305 +
306 +            try:
307 +                exchange_word = banmal_dic[exchange_word]
308 +            except KeyError:
309 +                print("not in the banmal dictionary")
311 +
312 + temp_dec_input[i][idx] = LABEL.vocab.stoi[exchange_word]
313 + temp_dec_input[i][idx + 1] = LABEL.vocab.stoi['<eos>']
314 +
315 + for k in range(idx + 2, max_len):
316 + temp_dec_input[i][k] = LABEL.vocab.stoi['<pad>']
317 +
318 + # for j in range(len(temp_dec_input[i])):
319 + # if LABEL.vocab.itos[temp_dec_input[i][j]]=='<eos>':
320 + # break
321 + # print(LABEL.vocab.itos[temp_dec_input[i][j]], end='')
322 + # print()
323 +
324 + dec_input = torch.tensor(temp_dec_input, dtype=torch.int32).to(device)
325 +
326 + return enc_input, dec_input, dec_outputs
327 +
328 +
329 +# Dictionary for converting polite endings to banmal (informal speech).
330 +def get_rough_dic():
331 + my_exword = {
332 + '돌아와요': '돌아와',
333 + '으세요': '으셈',
334 + '잊어버려요': '잊어버려',
335 + '나온대요': '나온대',
336 + '될까요': '될까',
337 + '할텐데': '할텐데',
338 + '옵니다': '온다',
339 + '봅니다': '본다',
340 + '네요': '네',
341 + '된답니다': '된대',
342 + '데요': '데',
343 + '봐요': '봐',
344 + '부러워요': '부러워',
345 + '바랄게요': '바랄게',
346 + '지나갑니다': "지가간다",
347 + '이뻐요': "이뻐",
348 + '지요': "지",
349 + '사세요': "사라",
350 + '던가요': "던가",
351 + '모릅니다': "몰라",
352 + '은가요': "은가",
353 + '심해요': "심해",
354 + '몰라요': "몰라",
355 + '라요': "라",
356 + '더라고요': '더라고',
357 + '입니다': '이라고',
358 + '는다면요': '는다면',
359 + '멋져요': '멋져',
360 + '다면요': '다면',
361 + '다니': '다나',
362 + '져요': '져',
363 + '만드세요': '만들어',
364 + '야죠': '야지',
365 + '죠': '지',
366 + '해줄게요': '해줄게',
367 + '대요': '대',
368 + '돌아갑시다': '돌아가자',
369 + '해보여요': '해봐',
370 + '라뇨': '라니',
371 + '편합니다': '편해',
372 + '합시다': '하자',
373 + '드세요': '먹어',
374 + '아름다워요': '아름답네',
375 + '드립니다': '줄게',
376 + '받아들여요': '받아들여',
377 + '건가요': '간기',
378 + '쏟아진다': '쏟아지네',
379 + '슬퍼요': '슬퍼',
380 + '해서요': '해서',
381 + '다릅니다': '다르다',
382 + '니다': '니',
383 + '내려요': '내려',
384 + '마셔요': '마셔',
385 + '아세요': '아냐',
386 + '변해요': '뱐헤',
387 + '드려요': '드려',
388 + '아요': '아',
389 + '어서요': '어서',
390 + '뜁니다': '뛴다',
391 + '속상해요': '속상해',
392 + '래요': '래',
393 + '까요': '까',
394 + '어야죠': '어야지',
395 + '라니': '라니',
396 + '해집니다': '해진다',
397 + '으련만': '으련만',
398 + '지워져요': '지워져',
399 + '잘라요': '잘라',
400 + '고요': '고',
401 + '셔야죠': '셔야지',
402 + '다쳐요': '다쳐',
403 + '는구나': '는구만',
404 + '은데요': '은데',
405 + '일까요': '일까',
406 + '인가요': '인가',
407 + '아닐까요': '아닐까',
408 + '텐데요': '텐데',
409 + '할게요': '할게',
410 + '보입니다': '보이네',
411 + '에요': '야',
412 + '걸요': '걸',
413 + '한답니다': '한대',
414 + '을까요': '을까',
415 + '못해요': '못해',
416 + '베푸세요': '베풀어',
417 + '어때요': '어떄',
418 + '더라구요': '더라구',
419 + '노라': '노라',
420 + '반가워요': '반가워',
421 + '군요': '군',
422 + '만납시다': '만나자',
423 + '어떠세요': '어때',
424 + '달라져요': '달라져',
425 + '예뻐요': '예뻐',
426 + '됩니다': '된다',
427 + '봅시다': '보자',
428 + '한대요': '한대',
429 + '싸워요': '싸워',
430 + '와요': '와',
431 + '인데요': '인데',
432 + '야': '야',
433 + '줄게요': '줄게',
434 + '기에요': '기',
435 + '던데요': '던데',
436 + '걸까요': '걸까',
437 + '신가요': '신가',
438 + '어요': '어',
439 + '따져요': '따져',
440 + '갈게요': '갈게',
441 + '봐': '봐',
442 + '나요': '나',
443 + '니까요': '니까',
444 + '마요': '마',
445 + '씁니다': '쓴다',
446 + '집니다': '진다',
447 + '건데요': '건데',
448 + '지웁시다': '지우자',
449 + '바랍니다': '바래',
450 + '는데요': '는데',
451 + '으니까요': '으니까',
452 + '셔요': '셔',
453 + '네여': '네',
454 + '달라요': '달라',
455 + '거려요': '거려',
456 + '보여요': '보여',
457 + '겁니다': '껄',
458 + '다': '다',
459 + '그래요': '그래',
460 + '한가요': '한가',
461 + '잖아요': '잖아',
462 + '한데요': '한데',
463 + '우세요': '우셈',
464 + '해야죠': '해야지',
465 + '세요': '셈',
466 + '걸려요': '걸려',
467 + '텐데': '텐데',
468 + '어딘가': '어딘가',
469 + '요': '',
470 + '흘러갑니다': '흘러간다',
471 + '줘요': '줘',
472 + '편해요': '편해',
473 + '거예요': '거야',
474 + '예요': '야',
475 + '습니다': '어',
476 + '아닌가요': '아닌가',
477 + '합니다': '한다',
478 + '사라집니다': '사라져',
479 + '드릴게요': '줄게',
480 + '다면': '다면',
481 + '그럴까요': '그럴까',
482 + '해요': '해',
483 + '답니다': '다',
484 + '주무세요': '자라',
485 + '마세요': '마라',
486 + '아픈가요': '아프냐',
487 + '그런가요': '그런가',
488 + '했잖아요': '했잖아',
489 + '버려요': '버려',
490 + '갑니다': '간다',
491 + '가요': '가',
492 + '라면요': '라면',
493 + '아야죠': '아야지',
494 + '살펴봐요': '살펴봐',
495 + '남겨요': '남겨',
496 + '내려놔요': '내려놔',
497 + '떨려요': '떨려',
498 + '랍니다': '란다',
499 + '돼요': '돼',
500 + '버텨요': '버텨',
501 + '만나': '만나',
502 + '일러요': '일러',
503 + '을게요': '을게',
504 + '갑시다': '가자',
505 + '나아요': '나아',
506 + '어려요': '어려',
507 + '온대요': '온대',
508 + '다고요': '다고',
509 + '할래요': '할래',
510 + '된대요': '된대',
511 + '어울려요': '어울려',
512 + '는군요': '는군',
513 + '볼까요': '볼까',
514 + '드릴까요': '줄까',
515 + '라던데요': '라던데',
516 + '올게요': '올게',
517 + '기뻐요': '기뻐',
518 + '아닙니다': '아냐',
519 + '둬요': '둬',
520 + '십니다': '십',
521 + '아파요': '아파',
522 + '생겨요': '생겨',
523 + '해줘요': '해줘',
524 + '로군요': '로군요',
525 + '시켜요': '시켜',
526 + '느껴져요': '느껴져',
527 + '가재요': '가재',
528 + '어 ': ' ',
529 + '느려요': '느려',
530 + '볼게요': '볼게',
531 + '쉬워요': '쉬워',
532 + '나빠요': '나빠',
533 + '불러줄게요': '불러줄게',
534 + '살쪄요': '살쪄',
535 + '봐야겠어요': '봐야겠어',
536 + '네': '네',
537 + '어': '어',
538 + '든지요': '든지',
539 + '드신다': '드심',
540 + '가져요': '가져',
541 + '할까요': '할까',
542 + '졸려요': '졸려',
543 + '그럴게요': '그럴게',
544 + '': '',
545 + '어린가': '어린가',
546 + '나와요': '나와',
547 + '빨라요': '빨라',
548 + '겠죠': '겠지',
549 + '졌어요': '졌어',
550 + '해봐요': '해봐',
551 + '게요': '게',
552 + '해드릴까요': '해줄까',
553 + '인걸요': '인걸',
554 + '했어요': '했어',
555 + '원해요': '원해',
556 + '는걸요': '는걸',
557 + '좋아합니다': '좋아해',
558 + '했으면': '했으면',
559 + '나갑니다': '나간다',
560 + '왔어요': '왔어',
561 + '해봅시다': '해보자',
562 + '물어봐요': '물어봐',
563 + '생겼어요': '생겼어',
564 + '해': '해',
565 + '다녀올게요': '다녀올게',
566 + '납시다': '나자'
567 + }
568 + return my_exword
1 +function send() {
2 + /*client side */
3 + var chat = document.createElement("li");
4 + var chat_input = document.getElementById("chat_input");
5 + var chat_text = chat_input.value;
6 + chat.className = "chat-bubble mine";
7 +    chat.innerText = chat_text;
8 + document.getElementById("chat_list").appendChild(chat);
9 + chat_input.value = "";
10 +
11 + /* ajax request */
12 + var request = new XMLHttpRequest();
13 +    request.open("POST", `${window.location.protocol}//${window.location.host}/api/soft`, true);
14 + request.onreadystatechange = function() {
15 +        if (request.readyState !== 4 || Math.floor(request.status / 100) !== 2) return;
16 + var bot_chat = document.createElement("li");
17 + bot_chat.className = "chat-bubble bots";
18 + bot_chat.innerText = JSON.parse(request.responseText).data;
19 + document.getElementById("chat_list").appendChild(bot_chat);
20 +
21 + };
22 +    request.setRequestHeader("Content-Type", "application/json;charset=UTF-8");
23 +    request.send(JSON.stringify({"data": chat_text}));
24 +}
25 +
26 +function setDefault() {
27 + document.getElementById("chat_input").addEventListener("keyup", function(event) {
28 + let input = document.getElementById("chat_input").value;
29 + let button = document.getElementById("send_button");
30 + if(input.length>0)
31 + {
32 + button.removeAttribute("disabled");
33 + }
34 + else
35 + {
36 + button.setAttribute("disabled", "true");
37 + }
38 + // Number 13 is the "Enter" key on the keyboard
39 + if (event.keyCode === 13) {
40 + // Cancel the default action, if needed
41 + event.preventDefault();
42 + // Trigger the button element with a click
43 + button.click();
44 + }
45 + });
46 +}
1 +from flask import Flask, request, jsonify, send_from_directory
2 +import torch
3 +from torchtext import data
4 +from generation import inference, tokenizer1
5 +from Styling import make_special_token
6 +from model import Transformer
7 +
8 +app = Flask(__name__,
9 + static_url_path='',
10 + static_folder='static',)
11 +app.config['JSON_AS_ASCII'] = False
12 +device = torch.device('cpu')
13 +max_len = 40
14 +ID = data.Field(sequential=False,
15 + use_vocab=False)
16 +SA = data.Field(sequential=False,
17 + use_vocab=False)
18 +TEXT = data.Field(sequential=True,
19 + use_vocab=True,
20 + tokenize=tokenizer1,
21 + batch_first=True,
22 + fix_length=max_len,
23 + dtype=torch.int32
24 + )
25 +
26 +LABEL = data.Field(sequential=True,
27 + use_vocab=True,
28 + tokenize=tokenizer1,
29 + batch_first=True,
30 + fix_length=max_len,
31 + init_token='<sos>',
32 + eos_token='<eos>',
33 + dtype=torch.int32
34 + )
35 +text_specials, label_specials = make_special_token(False)
36 +train_data, _ = data.TabularDataset.splits(
37 + path='.', train='chatbot_0325_ALLLABEL_train.txt', test='chatbot_0325_ALLLABEL_test.txt', format='tsv',
38 + fields=[('id', ID), ('text', TEXT), ('target_text', LABEL), ('SA', SA)], skip_header=True
39 +)
40 +TEXT.build_vocab(train_data, max_size=15000, specials=text_specials)
41 +LABEL.build_vocab(train_data, max_size=15000, specials=label_specials)
42 +soft_model = Transformer(160, 2, 2, 0.1, TEXT, LABEL)
43 +# rough_model = Transformer(args, TEXT, LABEL)
44 +soft_model.to(device)
45 +# rough_model.to(device)
46 +soft_model.load_state_dict(torch.load('sorted_model-soft.pth', map_location=device)['model_state_dict'])
47 +
48 +
49 +# rough_model.load_state_dict(torch.load('sorted_model-rough.pth', map_location=device)['model_state_dict'])
50 +
51 +
52 +@app.route('/api/soft', methods=['POST'])
53 +def soft():
54 + if request.is_json:
55 + sentence = request.json["data"]
56 + return jsonify({"data": inference(device, max_len, TEXT, LABEL, soft_model, sentence)}), 200
57 + else:
58 + return jsonify({"data": "잘못된 요청입니다. Bad Request."}), 400
59 +
60 +# @app.route('/rough', methods=['POST'])
61 +# def rough():
62 +# return inference(device, max_len, TEXT, LABEL, rough_model, ), 200
63 +
64 +@app.route('/', methods=['GET'])
65 +def main_page():
66 +    return send_from_directory('static', 'main.html')
67 +
68 +if __name__ == '__main__':
69 + app.run(host='0.0.0.0', port=8080)
1 +ul.no-bullets {
2 + list-style-type: none; /* Remove bullets */
3 + padding: 0; /* Remove padding */
4 + margin: 0; /* Remove margins */
5 + }
6 +
7 +.chat-bubble {
8 + position: relative;
9 + padding: 0.5em;
10 + margin-top: 0.25em;
11 + margin-bottom: 0.25em;
12 + border-radius: 0.4em;
13 + color: white;
14 +}
15 +.mine {
16 + background: #00aabb;
17 +}
18 +.bots {
19 + background: #cc78c5;
20 +}
21 +
22 +.chat-bubble:after {
23 + content: "";
24 + position: absolute;
25 + top: 50%;
26 + width: 0;
27 + height: 0;
28 + border: 0.625em solid transparent;
29 + border-top: 0;
30 + margin-top: -0.312em;
31 +
32 +}
33 +.chat-bubble.mine:after {
34 + right: 0;
35 +
36 + border-left-color: #00aabb;
37 + border-right: 0;
38 + margin-right: -0.625em;
39 +}
40 +
41 +.chat-bubble.bots:after {
42 + left: 0;
43 +
44 + border-right-color: #cc78c5;
45 + border-left: 0;
46 + margin-left: -0.625em;
47 +}
48 +
49 +#chat_input {
50 + width: 90%;
51 +}
52 +
53 +#send_button {
54 +
55 + width: 5%;
56 + border-radius: 0.4em;
57 + color: white;
58 + background-color: rgb(15, 145, 138);
59 +}
60 +
61 +.input-holder {
62 + position: fixed;
63 + left: 0;
64 + right: 0;
65 + bottom: 0;
66 + padding: 0.25em;
67 + background-color: lightseagreen;
68 +}
...\ No newline at end of file ...\ No newline at end of file
1 +import torch
2 +from konlpy.tag import Mecab
3 +from chatspace import ChatSpace
4 +
5 +spacer = ChatSpace()
6 +mecab = Mecab()  # instantiate once; creating a Mecab tagger on every call is slow
7 +
8 +
9 +def tokenizer1(text: str):
10 +    # Keep only alphanumeric characters (note: this also strips whitespace), then split into morphemes.
11 +    result_text = ''.join(c for c in text if c.isalnum())
12 +    return mecab.morphs(result_text)
13 +
14 +
15 +def inference(device: torch.device, max_len: int, TEXT, LABEL, model: torch.nn.Module, sentence: str):
16 +
17 + enc_input = tokenizer1(sentence)
18 + enc_input_index = []
19 +
20 + for tok in enc_input:
21 + enc_input_index.append(TEXT.vocab.stoi[tok])
22 +
23 + for j in range(max_len - len(enc_input_index)):
24 + enc_input_index.append(TEXT.vocab.stoi['<pad>'])
25 +
26 +    enc_input_index = torch.LongTensor([enc_input_index])  # torch.autograd.Variable is deprecated
27 +
28 + dec_input = torch.LongTensor([[LABEL.vocab.stoi['<sos>']]])
29 +
30 + model.eval()
31 + pred = []
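    # Greedy decoding: append the argmax token to the decoder input each step until <eos> or max_len.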
32 + for i in range(max_len):
33 + y_pred = model(enc_input_index.to(device), dec_input.to(device))
34 + y_pred_ids = y_pred.max(dim=-1)[1]
35 + if y_pred_ids[0, -1] == LABEL.vocab.stoi['<eos>']:
36 + y_pred_ids = y_pred_ids.squeeze(0)
37 + print(">", end=" ")
38 + for idx in range(len(y_pred_ids)):
39 + if LABEL.vocab.itos[y_pred_ids[idx]] == '<eos>':
40 + pred_sentence = "".join(pred)
41 + pred_str = spacer.space(pred_sentence)
42 + return pred_str
43 + else:
44 + pred.append(LABEL.vocab.itos[y_pred_ids[idx]])
45 +                return 'Error: sentence did not end with <eos>'
46 +
47 + dec_input = torch.cat(
48 + [dec_input.to(torch.device('cpu')),
49 + y_pred_ids[0, -1].unsqueeze(0).unsqueeze(0).to(torch.device('cpu'))], dim=-1)
50 +    return 'Error: no sentence was generated'
1 +import argparse
2 +import time
3 +import torch
4 +from torch import nn
5 +from torchtext import data
6 +from torchtext.data import BucketIterator
7 +from torchtext.data import TabularDataset
8 +
9 +from Styling import styling, make_special_token
10 +from generation import inference, tokenizer1
11 +from model import Transformer, GradualWarmupScheduler
12 +
13 +SEED = 1234
14 +
15 +
18 +def acc(yhat: torch.Tensor, y: torch.Tensor):
19 + with torch.no_grad():
20 + yhat = yhat.max(dim=-1)[1] # [0]: max value, [1]: index of max value
21 +        _acc = (yhat == y).float()[y != 1].mean()  # exclude padding from the accuracy
22 + return _acc
23 +
24 +
25 +def train(model: Transformer, iterator, optimizer, criterion: nn.CrossEntropyLoss, max_len: int, per_soft: bool, per_rough: bool):
26 + total_loss = 0
27 + iter_num = 0
28 + tr_acc = 0
29 + model.train()
30 +
31 + for step, batch in enumerate(iterator):
32 + optimizer.zero_grad()
33 +
34 + enc_input, dec_input, enc_label = batch.text, batch.target_text, batch.SA
35 + dec_output = dec_input[:, 1:]
36 + dec_outputs = torch.zeros(dec_output.size(0), max_len).type_as(dec_input.data)
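        # dec_output drops the leading <sos>; dec_outputs is a max_len-sized buffer that styling() fills.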
37 +
38 +        # apply the emotion and speech style
39 + enc_input, dec_input, dec_outputs = \
40 + styling(enc_input, dec_input, dec_output, dec_outputs, enc_label, max_len, per_soft, per_rough, TEXT, LABEL)
41 +
42 + y_pred = model(enc_input, dec_input)
43 +
44 + y_pred = y_pred.reshape(-1, y_pred.size(-1))
45 + dec_output = dec_outputs.view(-1).long()
46 +
47 +        # indices of the non-padding values
48 +        real_value_index = [dec_output != 1]  # <pad> == 1
49 +
50 +        # exclude padding from the loss
51 + loss = criterion(y_pred[real_value_index], dec_output[real_value_index])
52 + loss.backward()
53 + optimizer.step()
54 +
55 + with torch.no_grad():
56 + train_acc = acc(y_pred, dec_output)
57 +
58 + total_loss += loss
59 + iter_num += 1
60 + tr_acc += train_acc
61 +
62 + return total_loss.data.cpu().numpy() / iter_num, tr_acc.data.cpu().numpy() / iter_num
63 +
64 +
65 +def test(model: Transformer, iterator, criterion: nn.CrossEntropyLoss):
66 + total_loss = 0
67 + iter_num = 0
68 + te_acc = 0
69 + model.eval()
70 +
71 + with torch.no_grad():
72 + for batch in iterator:
73 + enc_input, dec_input, enc_label = batch.text, batch.target_text, batch.SA
74 + dec_output = dec_input[:, 1:]
75 + dec_outputs = torch.zeros(dec_output.size(0), args.max_len).type_as(dec_input.data)
76 +
77 +            # apply the emotion and speech style
78 + enc_input, dec_input, dec_outputs = \
79 + styling(enc_input, dec_input, dec_output, dec_outputs, enc_label, args.max_len, args.per_soft, args.per_rough, TEXT, LABEL)
80 +
81 + y_pred = model(enc_input, dec_input)
82 +
83 + y_pred = y_pred.reshape(-1, y_pred.size(-1))
84 + dec_output = dec_outputs.view(-1).long()
85 +
86 + real_value_index = [dec_output != 1] # <pad> == 1
87 +
88 + loss = criterion(y_pred[real_value_index], dec_output[real_value_index])
89 +
90 + with torch.no_grad():
91 + test_acc = acc(y_pred, dec_output)
92 + total_loss += loss
93 + iter_num += 1
94 + te_acc += test_acc
95 +
96 + return total_loss.data.cpu().numpy() / iter_num, te_acc.data.cpu().numpy() / iter_num
97 +
98 +
99 +# Preprocesses the data and returns the fields and data loaders.
100 +def data_preprocessing(args, device):
101 +    # ID is unused. SA is the sentiment analysis label (0 or 1).
102 + ID = data.Field(sequential=False,
103 + use_vocab=False)
104 +
105 + TEXT = data.Field(sequential=True,
106 + use_vocab=True,
107 + tokenize=tokenizer1,
108 + batch_first=True,
109 + fix_length=args.max_len,
110 + dtype=torch.int32
111 + )
112 +
113 + LABEL = data.Field(sequential=True,
114 + use_vocab=True,
115 + tokenize=tokenizer1,
116 + batch_first=True,
117 + fix_length=args.max_len,
118 + init_token='<sos>',
119 + eos_token='<eos>',
120 + dtype=torch.int32
121 + )
122 +
123 + SA = data.Field(sequential=False,
124 + use_vocab=False)
125 +
126 + train_data, test_data = TabularDataset.splits(
127 + path='.', train='chatbot_0325_ALLLABEL_train.txt', test='chatbot_0325_ALLLABEL_test.txt', format='tsv',
128 + fields=[('id', ID), ('text', TEXT), ('target_text', LABEL), ('SA', SA)], skip_header=True
129 + )
130 +
131 +    # Build the special tokens required by TEXT and LABEL.
132 + text_specials, label_specials = make_special_token(args.per_rough)
133 +
134 + TEXT.build_vocab(train_data, max_size=15000, specials=text_specials)
135 + LABEL.build_vocab(train_data, max_size=15000, specials=label_specials)
136 +
137 + train_loader = BucketIterator(dataset=train_data, batch_size=args.batch_size, device=device, shuffle=True)
138 + test_loader = BucketIterator(dataset=test_data, batch_size=args.batch_size, device=device, shuffle=True)
139 +
140 + return TEXT, LABEL, train_loader, test_loader
141 +
142 +
143 +def main(TEXT, LABEL, arguments):
144 +
145 +    # print the parsed arguments
146 +    for idx, (key, value) in enumerate(arguments.__dict__.items()):
147 +        if idx == 0:
148 +            print("\nargparse{\n", "\t", key, ":", value)
149 +        elif idx == len(arguments.__dict__) - 1:
150 +            print("\t", key, ":", value, "\n}")
151 +        else:
152 +            print("\t", key, ":", value)
153 +
154 +    model = Transformer(arguments.embedding_dim, arguments.nhead, arguments.nlayers, arguments.dropout, TEXT, LABEL)
155 +    criterion = nn.CrossEntropyLoss(ignore_index=LABEL.vocab.stoi['<pad>'])
156 +    optimizer = torch.optim.Adam(params=model.parameters(), lr=arguments.lr)
157 +    scheduler = GradualWarmupScheduler(optimizer, multiplier=8, total_epoch=arguments.num_epochs)
158 +    if arguments.per_soft:
159 +        sorted_path = 'sorted_model-soft.pth'
160 +    else:
161 +        sorted_path = 'sorted_model-rough.pth'
162 + model.to(device)
163 + if arguments.train:
164 + best_valid_loss = float('inf')
165 + for epoch in range(arguments.num_epochs):
166 + torch.manual_seed(SEED)
167 + start_time = time.time()
168 +
169 + # train, validation
170 + train_loss, train_acc = \
171 + train(model, train_loader, optimizer, criterion, arguments.max_len, arguments.per_soft,
172 + arguments.per_rough)
173 + valid_loss, valid_acc = test(model, test_loader, criterion)
174 +
175 + scheduler.step(epoch)
176 +            # elapsed-time calculation
177 + end_time = time.time()
178 + elapsed_time = end_time - start_time
179 + epoch_mins = int(elapsed_time / 60)
180 + epoch_secs = int(elapsed_time - (epoch_mins * 60))
181 +
182 +            # torch.save(model.state_dict(), sorted_path)  # for some overfitting
183 +            # Save the model whenever the current validation loss beats the best so far.
184 + if valid_loss < best_valid_loss:
185 + best_valid_loss = valid_loss
186 + torch.save({
187 + 'epoch': epoch,
188 + 'model_state_dict': model.state_dict(),
189 + 'optimizer_state_dict': optimizer.state_dict(),
190 + 'loss': valid_loss},
191 + sorted_path)
192 + print(f'\t## SAVE valid_loss: {valid_loss:.3f} | valid_acc: {valid_acc:.3f} ##')
193 +
194 + # print loss and acc
195 + print(f'\n\t==Epoch: {epoch + 1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s==')
196 + print(f'\t==Train Loss: {train_loss:.3f} | Train_acc: {train_acc:.3f}==')
197 + print(f'\t==Valid Loss: {valid_loss:.3f} | Valid_acc: {valid_acc:.3f}==\n')
198 +
199 +
200 +
201 + checkpoint = torch.load(sorted_path, map_location=device)
202 + model.load_state_dict(checkpoint['model_state_dict'])
203 +
204 +    test_loss, test_acc = test(model, test_loader, criterion)
205 + print(f'==test_loss : {test_loss:.3f} | test_acc: {test_acc:.3f}==')
206 + print("\t-----------------------------")
207 + while True:
208 +        sentence = input("Enter a sentence: ")
209 +        print(inference(device, arguments.max_len, TEXT, LABEL, model, sentence))
210 + print("\n")
211 +
212 +
213 +if __name__ == '__main__':
214 +    # define the command-line arguments
215 +    parser = argparse.ArgumentParser()
216 +    parser.add_argument('--max_len', type=int, default=40)  # max_len must be large enough to avoid errors
217 + parser.add_argument('--batch_size', type=int, default=256)
218 + parser.add_argument('--num_epochs', type=int, default=22)
219 + parser.add_argument('--warming_up_epochs', type=int, default=5)
220 + parser.add_argument('--lr', type=float, default=0.0002)
221 + parser.add_argument('--embedding_dim', type=int, default=160)
222 + parser.add_argument('--nlayers', type=int, default=2)
223 + parser.add_argument('--nhead', type=int, default=2)
224 + parser.add_argument('--dropout', type=float, default=0.1)
225 + parser.add_argument('--train', action="store_true")
226 + group = parser.add_mutually_exclusive_group()
227 + group.add_argument('--per_soft', action="store_true")
228 + group.add_argument('--per_rough', action="store_true")
229 + args = parser.parse_args()
230 +    print("-Preparing-")
231 + device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
232 + TEXT, LABEL, train_loader, test_loader = data_preprocessing(args, device)
233 + main(TEXT, LABEL, args)
1 +<!DOCTYPE html>
2 +<html>
3 + <head>
4 + <meta charset="UTF-8">
5 + <meta name="viewport" content="width=device-width, initial-scale=1">
6 + <title>Emotional Chatbot with Styler</title>
7 + <script src="app.js"></script>
8 + <link rel="stylesheet" type="text/css" href="chat.css" />
9 + </head>
10 + <body onload="setDefault()">
11 + <ul id="chat_list" class="list no-bullets">
12 +<li class="chat-bubble mine">(a sample user message)</li>
13 +<li class="chat-bubble bots">(a sample bot reply)</li>
14 + </ul>
15 + <div class="input-holder">
16 + <input type="text" id="chat_input" autofocus/>
17 + <input type="button" id="send_button" class="button" value="↵" onclick="send()" disabled>
18 + </div>
19 + </body>
20 +</html>
1 +import math
2 +
3 +import torch
4 +import torch.nn as nn
5 +from torch.optim.lr_scheduler import ReduceLROnPlateau, _LRScheduler
4 +
5 +device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
6 +
7 +
8 +class Transformer(nn.Module):
9 + def __init__(self, embedding_dim: int, nhead: int, nlayers: int, dropout: float, SRC_vocab, TRG_vocab):
10 + super(Transformer, self).__init__()
11 + self.d_model = embedding_dim
12 + self.n_head = nhead
13 + self.num_encoder_layers = nlayers
14 + self.num_decoder_layers = nlayers
15 + self.dim_feedforward = embedding_dim
16 + self.dropout = dropout
17 +
18 + self.SRC_vo = SRC_vocab
19 + self.TRG_vo = TRG_vocab
20 +
21 + self.pos_encoder = PositionalEncoding(self.d_model, self.dropout)
22 +
23 + self.src_embedding = nn.Embedding(len(self.SRC_vo.vocab), self.d_model)
24 + self.trg_embedding = nn.Embedding(len(self.TRG_vo.vocab), self.d_model)
25 +
26 + self.transformer = nn.Transformer(d_model=self.d_model,
27 + nhead=self.n_head,
28 + num_encoder_layers=self.num_encoder_layers,
29 + num_decoder_layers=self.num_decoder_layers,
30 + dim_feedforward=self.dim_feedforward,
31 + dropout=self.dropout)
32 + self.proj_vocab_layer = nn.Linear(
33 + in_features=self.dim_feedforward, out_features=len(self.TRG_vo.vocab))
34 +
35 +
36 + def forward(self, en_input, de_input):
37 + x_en_embed = self.src_embedding(en_input.long()) * math.sqrt(self.d_model)
38 + x_de_embed = self.trg_embedding(de_input.long()) * math.sqrt(self.d_model)
39 + x_en_embed = self.pos_encoder(x_en_embed)
40 + x_de_embed = self.pos_encoder(x_de_embed)
41 +
42 + # Masking
43 + src_key_padding_mask = en_input == self.SRC_vo.vocab.stoi['<pad>']
44 + tgt_key_padding_mask = de_input == self.TRG_vo.vocab.stoi['<pad>']
45 + memory_key_padding_mask = src_key_padding_mask
46 + tgt_mask = self.transformer.generate_square_subsequent_mask(de_input.size(1))
47 +
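        # nn.Transformer in torch 1.4 has no batch_first option and expects inputs of
        # shape (seq_len, batch, d_model), so swap the first two dimensions here.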
48 + x_en_embed = torch.einsum('ijk->jik', x_en_embed)
49 + x_de_embed = torch.einsum('ijk->jik', x_de_embed)
50 +
51 + feature = self.transformer(src=x_en_embed,
52 + tgt=x_de_embed,
53 + src_key_padding_mask=src_key_padding_mask,
54 + tgt_key_padding_mask=tgt_key_padding_mask,
55 + memory_key_padding_mask=memory_key_padding_mask,
56 + tgt_mask=tgt_mask.to(device))
57 +
58 + logits = self.proj_vocab_layer(feature)
59 + logits = torch.einsum('ijk->jik', logits)
60 +
61 + return logits
62 +
63 +
64 +class PositionalEncoding(nn.Module):
65 +
66 + def __init__(self, d_model, dropout, max_len=15000):
67 + super(PositionalEncoding, self).__init__()
68 + self.dropout = nn.Dropout(p=dropout)
69 +
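        # Sinusoidal encoding: PE(pos, 2i) = sin(pos / 10000^(2i/d_model)),
        #                      PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))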
70 + pe = torch.zeros(max_len, d_model)
71 + position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
72 + div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
73 + pe[:, 0::2] = torch.sin(position * div_term)
74 + pe[:, 1::2] = torch.cos(position * div_term)
75 + pe = pe.unsqueeze(0).transpose(0, 1)
76 + self.register_buffer('pe', pe)
77 +
78 + def forward(self, x):
79 + x = x + self.pe[:x.size(0), :]
80 + return self.dropout(x)
81 +
82 +
86 +
87 +class GradualWarmupScheduler(_LRScheduler):
88 +    """ Gradually warms up (increases) the learning rate in the optimizer.
89 +    Proposed in 'Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour'.
90 +    Args:
91 +        optimizer (Optimizer): Wrapped optimizer.
92 +        multiplier: target learning rate = base lr * multiplier
93 +        total_epoch: the target learning rate is reached gradually at total_epoch
94 +        after_scheduler: the scheduler used after total_epoch (e.g. ReduceLROnPlateau)
95 +    """
96 +
97 + def __init__(self, optimizer, multiplier, total_epoch, after_scheduler=None):
98 + self.last_epoch = 1 # ReduceLROnPlateau is called at the end of epoch, whereas others are called at beginning
99 + self.multiplier = multiplier
100 + if self.multiplier <= 1.:
101 + raise ValueError('multiplier should be greater than 1.')
102 + self.total_epoch = total_epoch
103 + self.after_scheduler = after_scheduler
104 + self.finished = False
105 + super().__init__(optimizer)
106 +
107 + def get_lr(self):
108 + if self.last_epoch > self.total_epoch:
109 + if self.after_scheduler:
110 + if not self.finished:
111 + self.after_scheduler.base_lrs = [base_lr * self.multiplier for base_lr in self.base_lrs]
112 + self.finished = True
113 + return self.after_scheduler.get_lr()
114 + return [base_lr * self.multiplier for base_lr in self.base_lrs]
115 +
116 + return [base_lr * ((self.multiplier - 1.) * self.last_epoch / self.total_epoch + 1.) for base_lr in
117 + self.base_lrs]
118 +
119 + def step_ReduceLROnPlateau(self, metrics, epoch=None):
120 + if epoch is None:
121 + epoch = self.last_epoch + 1
122 + self.last_epoch = epoch if epoch != 0 else 1
123 + if self.last_epoch <= self.total_epoch:
124 + warmup_lr = [base_lr * ((self.multiplier - 1.) * self.last_epoch / self.total_epoch + 1.) for base_lr in
125 + self.base_lrs]
126 + for param_group, lr in zip(self.optimizer.param_groups, warmup_lr):
127 + param_group['lr'] = lr
128 + else:
129 + if epoch is None:
130 + self.after_scheduler.step(metrics, None)
131 + else:
132 + self.after_scheduler.step(metrics, epoch - self.total_epoch)
133 +
134 + def step(self, epoch=None, metrics=None):
135 + if type(self.after_scheduler) != ReduceLROnPlateau:
136 + if self.finished and self.after_scheduler:
137 + if epoch is None:
138 + self.after_scheduler.step(None)
139 + else:
140 + self.after_scheduler.step(epoch - self.total_epoch)
141 + else:
142 + return super(GradualWarmupScheduler, self).step(epoch)
143 + else:
144 + self.step_ReduceLROnPlateau(metrics, epoch)
1 +torch~=1.4.0
2 +Flask~=1.1.2
3 +torchtext~=0.6.0
4 +hgtk~=0.1.3
5 +konlpy~=0.5.2
6 +chatspace~=1.0.1
1 +function send() {
2 + /*client side */
3 + var chat = document.createElement("li");
4 + var chat_input = document.getElementById("chat_input");
5 + var chat_text = chat_input.value;
6 + chat.className = "chat-bubble mine";
7 +    chat.innerText = chat_text;
8 + document.getElementById("chat_list").appendChild(chat);
9 + chat_input.value = "";
10 +
11 + /* ajax request */
12 + var request = new XMLHttpRequest();
13 + request.open("POST", `${window.location.protocol}//${window.location.host}/api/soft`, true);
14 + request.onreadystatechange = function() {
15 +        if (request.readyState !== 4 || Math.floor(request.status / 100) !== 2) return;
16 + var bot_chat = document.createElement("li");
17 + bot_chat.className = "chat-bubble bots";
18 + bot_chat.innerText = JSON.parse(request.responseText).data;
19 + document.getElementById("chat_list").appendChild(bot_chat);
20 +
21 + };
22 +    request.setRequestHeader("Content-Type", "application/json;charset=UTF-8");
23 +    request.send(JSON.stringify({"data": chat_text}));
24 +}
25 +
26 +function setDefault() {
27 + document.getElementById("chat_input").addEventListener("keyup", function(event) {
28 + let input = document.getElementById("chat_input").value;
29 + let button = document.getElementById("send_button");
30 + if(input.length>0)
31 + {
32 + button.removeAttribute("disabled");
33 + }
34 + else
35 + {
36 + button.setAttribute("disabled", "true");
37 + }
38 + // Number 13 is the "Enter" key on the keyboard
39 + if (event.keyCode === 13) {
40 + // Cancel the default action, if needed
41 + event.preventDefault();
42 + // Trigger the button element with a click
43 + button.click();
44 + }
45 + });
46 +}
1 +ul.no-bullets {
2 + list-style-type: none; /* Remove bullets */
3 + padding: 0; /* Remove padding */
4 + margin: 0; /* Remove margins */
5 + }
6 +
7 +.chat-bubble {
8 + position: relative;
9 + padding: 0.5em;
10 + margin-top: 0.25em;
11 + margin-bottom: 0.25em;
12 + border-radius: 0.4em;
13 + color: white;
14 +}
15 +.mine {
16 + background: #00aabb;
17 +}
18 +.bots {
19 + background: #cc78c5;
20 +}
21 +
22 +.chat-bubble:after {
23 + content: "";
24 + position: absolute;
25 + top: 50%;
26 + width: 0;
27 + height: 0;
28 + border: 0.625em solid transparent;
29 + border-top: 0;
30 + margin-top: -0.312em;
31 +
32 +}
33 +.chat-bubble.mine:after {
34 + right: 0;
35 +
36 + border-left-color: #00aabb;
37 + border-right: 0;
38 + margin-right: -0.625em;
39 +}
40 +
41 +.chat-bubble.bots:after {
42 + left: 0;
43 +
44 + border-right-color: #cc78c5;
45 + border-left: 0;
46 + margin-left: -0.625em;
47 +}
48 +
49 +#chat_input {
50 + width: 90%;
51 +}
52 +
53 +#send_button {
54 +
55 + width: 5%;
56 + border-radius: 0.4em;
57 + color: white;
58 + background-color: rgb(15, 145, 138);
59 +}
60 +
61 +.input-holder {
62 + position: fixed;
63 + left: 0;
64 + right: 0;
65 + bottom: 0;
66 + padding: 0.25em;
67 + background-color: lightseagreen;
68 +}
1 +<!DOCTYPE html>
2 +<html>
3 + <head>
4 + <meta charset="UTF-8">
5 + <meta name="viewport" content="width=device-width, initial-scale=1">
6 + <title>Emotional Chatbot with Styler</title>
7 + <script src="app.js"></script>
8 + <link rel="stylesheet" type="text/css" href="chat.css" />
9 + </head>
10 + <body onload="setDefault()">
11 + <ul id="chat_list" class="list no-bullets">
12 +<li class="chat-bubble mine">Ask a question like this...</li>
13 +<li class="chat-bubble bots">...and the reply appears like this!</li>
14 + </ul>
15 + <div class="input-holder">
16 + <input type="text" id="chat_input" autofocus/>
17 + <input type="button" id="send_button" class="button" value="↵" onclick="send()" disabled>
18 + </div>
19 + </body>
20 +</html>
@@ -10,3 +10,51 @@ Model that varies chatbot responses by Language Style and sentiment analysis:
10 - GeForce RTX 2080 Ti
11 - Python 3.6.8
12 - Pytorch 1.2.0
13 +
14 +# Code
15 +## Chatbot
16 +
17 +### Chatbot_main.py
18 +The main file used to train and test the chatbot.
19 +### model.py
20 +The Transformer model class used by the chatbot.
21 +### generation.py
22 +Performs inference, including beam search and greedy search.
23 +### metric.py
24 +A module for measuring training performance.\
25 +`acc(yhat, y)`\
26 +### Styling.py
27 +Changes the speech style according to the personality.
28 +### get_data.py
29 +Preprocesses and loads the dataset.\
30 +`tokenizer1(text)`\
31 +* text: the string to tokenize
32 +Filters out special characters, then tokenizes with Mecab.\
33 +`data_preprocessing(args, device)`\
34 +* args: the Namespace parsed by argparse
35 +* device: the PyTorch device
36 +Tokenizes the text and builds the dataset, split into id, text, label, and sentiment analysis result.
37 +
38 +## KoBERT
39 +[SKTBrain KoBERT](https://github.com/SKTBrain/KoBERT)\
40 +A model from SKT Brain that adapts BERT to Korean.\
41 +It was trained for sentiment analysis on Naver movie reviews and is used for the chatbot's sentiment analysis.\
42 +## Light_model
43 +A model slimmed down for web hosting. It does not support KoBERT.
44 +### light_chatbot.py
45 +A console program for training and testing the chatbot model.
46 +`light_chatbot.py [--train] [--per_soft|--per_rough]`
47 +
48 +* train: trains a new model and saves it.
49 +If omitted, the saved model is loaded and tested.
50 +* per_soft: trains or tests the soft speech style.
51 +* per_rough: trains or tests the rough speech style.
52 +The two options are mutually exclusive.
53 +### app.py
54 +A simple HTTP server, built with Flask, for web hosting.\
55 +`POST /api/soft`\
56 +An API that runs inference with the soft model and returns the result as JSON (an example exchange is shown below).\
57 +`GET /`\
58 +Statically serves the HTML, CSS, and JS from the static folder.
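
A hypothetical request/response exchange (the `data` field matches app.py; the actual reply text is whatever the trained model generates):

```
POST /api/soft
Content-Type: application/json

{"data": "안녕하세요"}

HTTP/1.1 200 OK
{"data": "(model reply)"}
```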
59 +### Others
60 +generation.py, styling.py, and model.py play the same roles as in Chatbot.