Data Collection-checkpoint.ipynb 3.52 KB
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 데이터 수집\n",
    "데이터 : 코스피200 지수 종가/오픈/고가/저가/거래량 <br>\n",
    "수집 출처 : investing.com\n",
    "\n",
    "|<center>**Data Set**</center>|<center>**Period**</center>|<center>**Number of Data**</center>|\n",
    "|------|---|:---|\n",
    "|Training Data|2005/01/03 ~ 2018/12/28|3464|\n",
    "|Test Data|2019/01/02 ~ 2019/12/30|246|\n",
    "|Validation Data|2020/01/02 ~ 2020/09/29|188|"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "ename": "ModuleNotFoundError",
     "evalue": "No module named 'date'",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mModuleNotFoundError\u001b[0m                       Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-2-7e78e5e4da43>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m      3\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mpandas\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mpd\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      4\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mtqdm\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mtqdm\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0;32mimport\u001b[0m \u001b[0mdate\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'date'"
     ]
    }
   ],
   "source": [
    "# module\n",
    "\n",
    "import pandas as pd\n",
    "from tqdm import tqdm\n",
    "import date"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_data = pd.read_csv('/Users/yangyoonji/Documents/2020_2학기/캡스톤디자인/data/train_data.csv')\n",
    "test_data = pd.read_csv('/Users/yangyoonji/Documents/2020_2학기/캡스톤디자인/data/test_data.csv')\n",
    "validation_data = pd.read_csv('/Users/yangyoonji/Documents/2020_2학기/캡스톤디자인/data/validation_data.csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for i in tqdm(range(len(train_data))):\n",
    "    train_data['date'][i] = train_data['date'][i].replace(' ','-') "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_data.head(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for i in range(len(train_data)):\n",
    "    train_data['date'][i] = datetime.datetime.strptime(train_data['date'][i],\"%Y-%m-%d\").date()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "test_data.head(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "validation_data.head(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}