Homework 0

Public repository and stub/testing code for Homework 0 of 10-714.

  • softmax
    • In my first implementation the learning rate was too high, so the loss oscillated back and forth and never converged. Lowering it to a smaller value helped a lot.
    • I only noticed this by comparing with someone else's code, where the learning rate was applied differently. I should have been able to catch the problem myself.
  • neural network
    • Much more computation per step, so each iteration is noticeably slower, but the loss is good right from the start.
    • The loss easily reaches 0.05.

With the same 100 epochs, the neural network performs far better than softmax regression, which suggests that the digit-recognition task benefits greatly from introducing a nonlinearity (ReLU).
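The softmax training run below can be sketched as one SGD epoch over mini-batches. This is a hedged sketch, not the official solution: the function name, the in-place update of `theta`, and the `(num_examples, input_dim)` / `(input_dim, num_classes)` shapes follow the common hw0 stub conventions and are assumptions here.

```python
import numpy as np

def softmax_regression_epoch(X, y, theta, lr=0.1, batch=100):
    """One SGD epoch for softmax regression (sketch).

    X: (num_examples, input_dim), y: integer labels,
    theta: (input_dim, num_classes), updated in place.
    """
    for i in range(0, X.shape[0], batch):
        Xb, yb = X[i:i + batch], y[i:i + batch]
        logits = Xb @ theta
        # Numerically stable softmax over the class dimension
        Z = np.exp(logits - logits.max(axis=1, keepdims=True))
        Z /= Z.sum(axis=1, keepdims=True)
        # One-hot encode the labels
        I = np.zeros_like(Z)
        I[np.arange(Xb.shape[0]), yb] = 1
        # Gradient of mean cross-entropy w.r.t. theta
        grad = Xb.T @ (Z - I) / Xb.shape[0]
        theta -= lr * grad
```

The `lr` step size here is exactly the knob that caused the oscillation described above: too large and `theta` overshoots each batch's minimum.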

Training softmax regression
| Epoch | Train Loss | Train Err | Test Loss | Test Err |
|     0 |    0.38625 |   0.10812 |   0.36690 |  0.09960 |
|     1 |    0.34486 |   0.09748 |   0.32926 |  0.09180 |
|     2 |    0.32663 |   0.09187 |   0.31376 |  0.08770 |
|     3 |    0.31572 |   0.08867 |   0.30504 |  0.08510 |
|     4 |    0.30822 |   0.08667 |   0.29940 |  0.08320 |
|     5 |    0.30264 |   0.08508 |   0.29543 |  0.08250 |
|     6 |    0.29825 |   0.08393 |   0.29247 |  0.08180 |
|     7 |    0.29466 |   0.08305 |   0.29017 |  0.08120 |
|     8 |    0.29166 |   0.08215 |   0.28832 |  0.08070 |
|     9 |    0.28908 |   0.08137 |   0.28680 |  0.08080 |
|    10 |    0.28682 |   0.08068 |   0.28552 |  0.08040 |
|    11 |    0.28483 |   0.08000 |   0.28443 |  0.08010 |
|    12 |    0.28305 |   0.07943 |   0.28349 |  0.07960 |
|    13 |    0.28143 |   0.07885 |   0.28266 |  0.07900 |
|    14 |    0.27997 |   0.07833 |   0.28193 |  0.07870 |
|    15 |    0.27862 |   0.07788 |   0.28128 |  0.07890 |
|    16 |    0.27738 |   0.07742 |   0.28070 |  0.07880 |
|    17 |    0.27624 |   0.07705 |   0.28017 |  0.07860 |
|    18 |    0.27517 |   0.07668 |   0.27969 |  0.07870 |
|    19 |    0.27417 |   0.07627 |   0.27925 |  0.07870 |
|    20 |    0.27323 |   0.07598 |   0.27885 |  0.07850 |
|    21 |    0.27235 |   0.07568 |   0.27848 |  0.07830 |
|    22 |    0.27151 |   0.07558 |   0.27814 |  0.07820 |
|    23 |    0.27073 |   0.07523 |   0.27783 |  0.07770 |
|    24 |    0.26998 |   0.07513 |   0.27753 |  0.07740 |
|    25 |    0.26927 |   0.07485 |   0.27726 |  0.07740 |
|    26 |    0.26859 |   0.07477 |   0.27701 |  0.07720 |
|    27 |    0.26795 |   0.07452 |   0.27677 |  0.07720 |
|    28 |    0.26733 |   0.07452 |   0.27655 |  0.07730 |
|    29 |    0.26674 |   0.07432 |   0.27634 |  0.07720 |
|    30 |    0.26617 |   0.07402 |   0.27614 |  0.07720 |
|    31 |    0.26563 |   0.07385 |   0.27596 |  0.07720 |
|    32 |    0.26510 |   0.07365 |   0.27579 |  0.07730 |
|    33 |    0.26460 |   0.07337 |   0.27562 |  0.07710 |
|    34 |    0.26412 |   0.07322 |   0.27547 |  0.07690 |
|    35 |    0.26365 |   0.07298 |   0.27533 |  0.07700 |
|    36 |    0.26320 |   0.07293 |   0.27519 |  0.07680 |
|    37 |    0.26276 |   0.07272 |   0.27506 |  0.07660 |
|    38 |    0.26234 |   0.07252 |   0.27494 |  0.07640 |
|    39 |    0.26193 |   0.07233 |   0.27482 |  0.07630 |
|    40 |    0.26153 |   0.07218 |   0.27471 |  0.07640 |
|    41 |    0.26114 |   0.07218 |   0.27461 |  0.07640 |
|    42 |    0.26077 |   0.07197 |   0.27451 |  0.07620 |
|    43 |    0.26041 |   0.07182 |   0.27441 |  0.07610 |
|    44 |    0.26006 |   0.07170 |   0.27432 |  0.07600 |
|    45 |    0.25971 |   0.07165 |   0.27424 |  0.07600 |
|    46 |    0.25938 |   0.07150 |   0.27416 |  0.07590 |
|    47 |    0.25905 |   0.07145 |   0.27409 |  0.07580 |
|    48 |    0.25874 |   0.07128 |   0.27401 |  0.07560 |
|    49 |    0.25843 |   0.07123 |   0.27395 |  0.07550 |
|    50 |    0.25813 |   0.07105 |   0.27388 |  0.07560 |
|    51 |    0.25784 |   0.07098 |   0.27382 |  0.07580 |
|    52 |    0.25755 |   0.07083 |   0.27376 |  0.07590 |
|    53 |    0.25727 |   0.07068 |   0.27371 |  0.07580 |
|    54 |    0.25700 |   0.07050 |   0.27365 |  0.07600 |
|    55 |    0.25673 |   0.07042 |   0.27361 |  0.07610 |
|    56 |    0.25647 |   0.07025 |   0.27356 |  0.07600 |
|    57 |    0.25621 |   0.07005 |   0.27351 |  0.07600 |
|    58 |    0.25596 |   0.06992 |   0.27347 |  0.07590 |
|    59 |    0.25572 |   0.06985 |   0.27343 |  0.07580 |
|    60 |    0.25548 |   0.06985 |   0.27340 |  0.07570 |
|    61 |    0.25524 |   0.06982 |   0.27336 |  0.07560 |
|    62 |    0.25501 |   0.06977 |   0.27333 |  0.07550 |
|    63 |    0.25478 |   0.06972 |   0.27330 |  0.07560 |
|    64 |    0.25456 |   0.06965 |   0.27327 |  0.07560 |
|    65 |    0.25435 |   0.06958 |   0.27324 |  0.07550 |
|    66 |    0.25413 |   0.06952 |   0.27321 |  0.07540 |
|    67 |    0.25392 |   0.06940 |   0.27319 |  0.07550 |
|    68 |    0.25372 |   0.06935 |   0.27317 |  0.07550 |
|    69 |    0.25352 |   0.06932 |   0.27315 |  0.07540 |
|    70 |    0.25332 |   0.06922 |   0.27313 |  0.07520 |
|    71 |    0.25312 |   0.06923 |   0.27311 |  0.07510 |
|    72 |    0.25293 |   0.06915 |   0.27309 |  0.07510 |
|    73 |    0.25274 |   0.06907 |   0.27308 |  0.07520 |
|    74 |    0.25256 |   0.06908 |   0.27306 |  0.07520 |
|    75 |    0.25238 |   0.06903 |   0.27305 |  0.07530 |
|    76 |    0.25220 |   0.06890 |   0.27304 |  0.07530 |
|    77 |    0.25202 |   0.06892 |   0.27303 |  0.07520 |
|    78 |    0.25185 |   0.06887 |   0.27302 |  0.07520 |
|    79 |    0.25168 |   0.06878 |   0.27301 |  0.07500 |
|    80 |    0.25151 |   0.06875 |   0.27300 |  0.07500 |
|    81 |    0.25135 |   0.06875 |   0.27300 |  0.07490 |
|    82 |    0.25118 |   0.06865 |   0.27299 |  0.07500 |
|    83 |    0.25102 |   0.06855 |   0.27299 |  0.07490 |
|    84 |    0.25087 |   0.06847 |   0.27298 |  0.07500 |
|    85 |    0.25071 |   0.06847 |   0.27298 |  0.07510 |
|    86 |    0.25056 |   0.06840 |   0.27298 |  0.07500 |
|    87 |    0.25041 |   0.06837 |   0.27298 |  0.07500 |
|    88 |    0.25026 |   0.06837 |   0.27298 |  0.07540 |
|    89 |    0.25011 |   0.06835 |   0.27298 |  0.07530 |
|    90 |    0.24997 |   0.06827 |   0.27298 |  0.07550 |
|    91 |    0.24982 |   0.06820 |   0.27298 |  0.07540 |
|    92 |    0.24968 |   0.06813 |   0.27298 |  0.07540 |
|    93 |    0.24954 |   0.06808 |   0.27299 |  0.07540 |
|    94 |    0.24941 |   0.06808 |   0.27299 |  0.07540 |
|    95 |    0.24927 |   0.06802 |   0.27300 |  0.07540 |
|    96 |    0.24914 |   0.06793 |   0.27300 |  0.07540 |
|    97 |    0.24900 |   0.06790 |   0.27301 |  0.07540 |
|    98 |    0.24887 |   0.06788 |   0.27301 |  0.07550 |
|    99 |    0.24875 |   0.06778 |   0.27302 |  0.07550 |
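The Loss and Err columns in the table above are, presumably, average softmax cross-entropy and classification error rate. A minimal sketch of that metric, assuming `logits` is `(num_examples, num_classes)` and `y` holds integer labels (the helper name `loss_err` is hypothetical):

```python
import numpy as np

def loss_err(logits, y):
    """Return (mean cross-entropy loss, error rate) for given logits."""
    # Shift logits for numerical stability, then take log-softmax
    stable = logits - logits.max(axis=1, keepdims=True)
    log_probs = stable - np.log(np.exp(stable).sum(axis=1, keepdims=True))
    loss = -log_probs[np.arange(y.size), y].mean()
    # Error rate: fraction of examples whose argmax is not the label
    err = (logits.argmax(axis=1) != y).mean()
    return loss, err
```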

Training two layer neural network w/ 100 hidden units
| Epoch | Train Loss | Train Err | Test Loss | Test Err |
|     0 |    0.24180 |   0.07212 |   0.23828 |  0.06970 |
|     1 |    0.18094 |   0.05412 |   0.18455 |  0.05370 |
|     2 |    0.14927 |   0.04420 |   0.15777 |  0.04620 |
|     3 |    0.12775 |   0.03745 |   0.14044 |  0.04180 |
|     4 |    0.11274 |   0.03267 |   0.12908 |  0.03840 |
|     5 |    0.10125 |   0.02967 |   0.12084 |  0.03640 |
|     6 |    0.09205 |   0.02687 |   0.11450 |  0.03470 |
|     7 |    0.08421 |   0.02435 |   0.10920 |  0.03330 |
|     8 |    0.07765 |   0.02233 |   0.10508 |  0.03150 |
|     9 |    0.07219 |   0.02072 |   0.10218 |  0.03020 |
|    10 |    0.06762 |   0.01913 |   0.09990 |  0.03010 |
|    11 |    0.06319 |   0.01770 |   0.09718 |  0.02950 |
|    12 |    0.05961 |   0.01652 |   0.09544 |  0.02900 |
|    13 |    0.05630 |   0.01535 |   0.09385 |  0.02920 |
|    14 |    0.05325 |   0.01442 |   0.09220 |  0.02890 |
|    15 |    0.05052 |   0.01373 |   0.09090 |  0.02810 |
|    16 |    0.04797 |   0.01300 |   0.08961 |  0.02760 |
|    17 |    0.04554 |   0.01230 |   0.08860 |  0.02670 |
|    18 |    0.04336 |   0.01155 |   0.08776 |  0.02610 |
|    19 |    0.04147 |   0.01105 |   0.08711 |  0.02570 |
|    20 |    0.03961 |   0.01030 |   0.08649 |  0.02550 |
|    21 |    0.03785 |   0.01002 |   0.08608 |  0.02540 |
|    22 |    0.03622 |   0.00945 |   0.08522 |  0.02540 |
|    23 |    0.03459 |   0.00892 |   0.08464 |  0.02520 |
|    24 |    0.03326 |   0.00843 |   0.08435 |  0.02510 |
|    25 |    0.03195 |   0.00805 |   0.08412 |  0.02510 |
|    26 |    0.03061 |   0.00770 |   0.08374 |  0.02480 |
|    27 |    0.02956 |   0.00730 |   0.08360 |  0.02460 |
|    28 |    0.02839 |   0.00697 |   0.08339 |  0.02520 |
|    29 |    0.02736 |   0.00650 |   0.08338 |  0.02530 |
|    30 |    0.02628 |   0.00600 |   0.08299 |  0.02500 |
|    31 |    0.02530 |   0.00582 |   0.08285 |  0.02510 |
|    32 |    0.02434 |   0.00542 |   0.08275 |  0.02510 |
|    33 |    0.02350 |   0.00515 |   0.08279 |  0.02520 |
|    34 |    0.02262 |   0.00485 |   0.08264 |  0.02500 |
|    35 |    0.02178 |   0.00470 |   0.08249 |  0.02510 |
|    36 |    0.02108 |   0.00453 |   0.08259 |  0.02470 |
|    37 |    0.02033 |   0.00428 |   0.08248 |  0.02490 |
|    38 |    0.01954 |   0.00402 |   0.08224 |  0.02460 |
|    39 |    0.01880 |   0.00388 |   0.08199 |  0.02430 |
|    40 |    0.01821 |   0.00373 |   0.08217 |  0.02420 |
|    41 |    0.01756 |   0.00352 |   0.08187 |  0.02420 |
|    42 |    0.01685 |   0.00338 |   0.08170 |  0.02360 |
|    43 |    0.01630 |   0.00325 |   0.08173 |  0.02360 |
|    44 |    0.01573 |   0.00307 |   0.08148 |  0.02350 |
|    45 |    0.01522 |   0.00282 |   0.08155 |  0.02360 |
|    46 |    0.01463 |   0.00262 |   0.08130 |  0.02330 |
|    47 |    0.01422 |   0.00250 |   0.08137 |  0.02370 |
|    48 |    0.01369 |   0.00230 |   0.08127 |  0.02350 |
|    49 |    0.01324 |   0.00215 |   0.08128 |  0.02350 |
|    50 |    0.01278 |   0.00198 |   0.08114 |  0.02360 |
|    51 |    0.01240 |   0.00185 |   0.08116 |  0.02340 |
|    52 |    0.01197 |   0.00167 |   0.08125 |  0.02360 |
|    53 |    0.01165 |   0.00162 |   0.08115 |  0.02350 |
|    54 |    0.01124 |   0.00150 |   0.08111 |  0.02340 |
|    55 |    0.01087 |   0.00140 |   0.08112 |  0.02340 |
|    56 |    0.01056 |   0.00132 |   0.08120 |  0.02330 |
|    57 |    0.01019 |   0.00117 |   0.08122 |  0.02320 |
|    58 |    0.00988 |   0.00108 |   0.08120 |  0.02310 |
|    59 |    0.00964 |   0.00112 |   0.08136 |  0.02310 |
|    60 |    0.00927 |   0.00092 |   0.08120 |  0.02300 |
|    61 |    0.00903 |   0.00087 |   0.08123 |  0.02290 |
|    62 |    0.00870 |   0.00070 |   0.08124 |  0.02320 |
|    63 |    0.00850 |   0.00068 |   0.08124 |  0.02280 |
|    64 |    0.00820 |   0.00063 |   0.08141 |  0.02310 |
|    65 |    0.00799 |   0.00062 |   0.08129 |  0.02300 |
|    66 |    0.00774 |   0.00057 |   0.08135 |  0.02280 |
|    67 |    0.00752 |   0.00050 |   0.08148 |  0.02250 |
|    68 |    0.00732 |   0.00045 |   0.08151 |  0.02230 |
|    69 |    0.00710 |   0.00043 |   0.08154 |  0.02250 |
|    70 |    0.00691 |   0.00042 |   0.08159 |  0.02210 |
|    71 |    0.00671 |   0.00035 |   0.08168 |  0.02220 |
|    72 |    0.00655 |   0.00035 |   0.08169 |  0.02220 |
|    73 |    0.00636 |   0.00030 |   0.08190 |  0.02210 |
|    74 |    0.00620 |   0.00030 |   0.08193 |  0.02210 |
|    75 |    0.00603 |   0.00027 |   0.08196 |  0.02190 |
|    76 |    0.00590 |   0.00022 |   0.08212 |  0.02210 |
|    77 |    0.00574 |   0.00020 |   0.08211 |  0.02210 |
|    78 |    0.00560 |   0.00018 |   0.08225 |  0.02190 |
|    79 |    0.00546 |   0.00018 |   0.08227 |  0.02200 |
|    80 |    0.00534 |   0.00017 |   0.08244 |  0.02190 |
|    81 |    0.00521 |   0.00012 |   0.08257 |  0.02210 |
|    82 |    0.00510 |   0.00012 |   0.08261 |  0.02200 |
|    83 |    0.00497 |   0.00012 |   0.08269 |  0.02200 |
|    84 |    0.00484 |   0.00010 |   0.08278 |  0.02200 |
|    85 |    0.00476 |   0.00010 |   0.08292 |  0.02210 |
|    86 |    0.00464 |   0.00008 |   0.08299 |  0.02190 |
|    87 |    0.00455 |   0.00007 |   0.08310 |  0.02190 |
|    88 |    0.00445 |   0.00007 |   0.08321 |  0.02170 |
|    89 |    0.00437 |   0.00007 |   0.08329 |  0.02210 |
|    90 |    0.00427 |   0.00005 |   0.08335 |  0.02200 |
|    91 |    0.00418 |   0.00003 |   0.08345 |  0.02180 |
|    92 |    0.00410 |   0.00003 |   0.08364 |  0.02190 |
|    93 |    0.00402 |   0.00003 |   0.08366 |  0.02180 |
|    94 |    0.00393 |   0.00003 |   0.08369 |  0.02200 |
|    95 |    0.00386 |   0.00003 |   0.08390 |  0.02200 |
|    96 |    0.00379 |   0.00003 |   0.08401 |  0.02200 |
|    97 |    0.00372 |   0.00003 |   0.08405 |  0.02210 |
|    98 |    0.00365 |   0.00002 |   0.08414 |  0.02190 |
|    99 |    0.00357 |   0.00002 |   0.08421 |  0.02190 |
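The two-layer network trained above, `softmax(ReLU(X W1) W2)`, can be sketched as one SGD epoch with manual backprop through the ReLU. Again a hedged sketch following the common hw0 stub shapes, not the official solution: `W1` is `(input_dim, hidden_dim)`, `W2` is `(hidden_dim, num_classes)`, both updated in place.

```python
import numpy as np

def nn_epoch(X, y, W1, W2, lr=0.1, batch=100):
    """One SGD epoch for a two-layer ReLU network (sketch)."""
    for i in range(0, X.shape[0], batch):
        Xb, yb = X[i:i + batch], y[i:i + batch]
        Z1 = np.maximum(Xb @ W1, 0)        # ReLU hidden activations
        logits = Z1 @ W2
        # Stable softmax
        S = np.exp(logits - logits.max(axis=1, keepdims=True))
        S /= S.sum(axis=1, keepdims=True)
        I = np.zeros_like(S)
        I[np.arange(Xb.shape[0]), yb] = 1
        G2 = S - I                          # gradient at the output layer
        G1 = (Z1 > 0) * (G2 @ W2.T)         # backprop through the ReLU mask
        W1 -= lr * (Xb.T @ G1) / Xb.shape[0]
        W2 -= lr * (Z1.T @ G2) / Xb.shape[0]
```

The extra matrix products (`Xb @ W1`, `Z1 @ W2`, and their transposed counterparts in the backward pass) are where the slower per-iteration cost noted above comes from.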
