Preface
The task and dataset in this post come from the leaf-classification competition in the Dive into Deep Learning course; see Reference 1.
The code mainly follows the ResNet baseline published by Kaggle user nekokiku; see Reference 2.
From this competition and its code you can learn:
what the code and structure of a simple PyTorch deep learning project look like;
how to subclass PyTorch's Dataset and use DataLoader to build your own data pipeline;
how to build a ResNet model quickly with PyTorch.
Workflow
Analyzing the baseline code, the deep learning workflow can be summarized as:
1. Process the data, both metadata and images: inspect what the data looks like, the label distribution, the number of unique labels, and so on.
2. Implement your own Dataset and DataLoader classes.
3. Choose CPU or GPU.
4. Define the model.
5. Define hyperparameters such as the learning rate.
6. Train and validate.
7. Predict with the trained model.
Structure
The project structure for this task:

```
leaves_classification_competition/
└─data/
  └─classify-leaves/
    └─images/
    └─train.csv
    └─test.csv
└─script.ipynb
```
The data/ directory holds the data for this task: unzip the archive downloaded from Kaggle into this folder.
script.ipynb is the code script.
In a larger project, the code should be split into separate .py files.
Baseline
Imports

```python
import torch
import torch.nn as nn
import pandas as pd
import numpy as np
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image
import os
import matplotlib.pyplot as plt
import torchvision.models as models
from tqdm import tqdm
import seaborn as sns
```
Data processing
Let's look at what train.csv contains:
```python
DATA_BASE_PATH = './data/classify-leaves/'
labels_df = pd.read_csv(os.path.join(DATA_BASE_PATH, 'train.csv'))
labels_df.head()
```
```python
# Collect the sorted unique labels and map each class name to an integer index.
leaves_labels = sorted(list(set(labels_df['label'])))
n_classes = len(leaves_labels)
class2num = dict(zip(leaves_labels, range(n_classes)))
class2num
```
```
{'abies_concolor': 0,
 'abies_nordmanniana': 1,
 'acer_campestre': 2,
 ……
 'zelkova_serrata': 175}
```
```python
# Invert the mapping so predicted indices can be decoded back to class names.
num2class = {v: k for k, v in class2num.items()}
num2class
```
```
{0: 'abies_concolor',
 1: 'abies_nordmanniana',
 2: 'acer_campestre',
 3: 'acer_ginnala',
 4: 'acer_griseum',
 ……
 175: 'zelkova_serrata'}
```
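The mapping logic above is plain Python; a toy version with made-up label names (illustrative only, not from the dataset) shows the round trip from class name to index and back:

```python
# Toy version of the label mapping: map sorted unique class names to indices,
# then invert the mapping. The label names here are made up for illustration.
toy_labels = ['oak', 'maple', 'oak', 'pine']
toy_classes = sorted(set(toy_labels))
toy_class2num = {c: i for i, c in enumerate(toy_classes)}
toy_num2class = {i: c for c, i in toy_class2num.items()}
print(toy_class2num)     # {'maple': 0, 'oak': 1, 'pine': 2}
print(toy_num2class[0])  # maple
```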
Implementing the Dataset
Subclass Dataset to implement your own dataset. A subclass needs to implement three methods:
__init__: takes the necessary arguments and initializes the object;
__getitem__: returns one item, namely the image and label in train/valid mode, or just the image in test mode;
__len__: the length of the dataset.
See Reference 3 for more details.
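Before the full implementation, the protocol itself can be sketched framework-free (ToySet is a hypothetical class for illustration, not part of the baseline): any object with these three methods can back a DataLoader.

```python
class ToySet:
    """Minimal object satisfying the Dataset protocol."""
    def __init__(self, items):
        self.items = items          # __init__: store whatever the dataset needs

    def __getitem__(self, index):
        return self.items[index]    # __getitem__: return one sample by index

    def __len__(self):
        return len(self.items)      # __len__: total number of samples
```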
```python
class LeaveDataSet(Dataset):
    def __init__(self, csv_path, img_path, mode='train', valid_ratio=0.2,
                 resize_height=256, resize_width=256):
        """
        Args:
            csv_path: path to the label file
            img_path: directory holding the images
            mode: 'train', 'valid' or 'test'
            valid_ratio: fraction of rows held out for validation
        """
        # Note: resize_height/resize_width are unused; images are resized
        # to 224x224 in __getitem__ below.
        self.img_path = img_path
        self.mode = mode
        self.data_info = pd.read_csv(csv_path)
        self.data_len = len(self.data_info)
        self.train_len = int(self.data_len * (1 - valid_ratio))

        if mode == 'train':
            self.train_image = np.asarray(self.data_info.iloc[:self.train_len, 0])
            self.train_label = np.asarray(self.data_info.iloc[:self.train_len, 1])
            self.image_arr = self.train_image
            self.label_arr = self.train_label
        elif mode == 'valid':
            self.valid_image = np.asarray(self.data_info.iloc[self.train_len:, 0])
            self.valid_label = np.asarray(self.data_info.iloc[self.train_len:, 1])
            self.image_arr = self.valid_image
            self.label_arr = self.valid_label
        else:
            self.test_image = np.asarray(self.data_info.iloc[:, 0])
            self.image_arr = self.test_image

        self.real_len = len(self.image_arr)
        print("Finished reading the {} set of Leaves Dataset. ({} samples found)"
              .format(mode, self.real_len))

    def __getitem__(self, index):
        single_image_name = self.image_arr[index]
        img_as_img = Image.open(os.path.join(self.img_path, single_image_name))

        if self.mode == 'train':
            # Random horizontal flips as light data augmentation during training.
            transform = transforms.Compose([
                transforms.Resize((224, 224)),
                transforms.RandomHorizontalFlip(p=0.5),
                transforms.ToTensor()
            ])
        else:
            transform = transforms.Compose([
                transforms.Resize((224, 224)),
                transforms.ToTensor()
            ])
        img_as_img = transform(img_as_img)

        if self.mode == 'test':
            return img_as_img
        else:
            label = self.label_arr[index]
            number_label = class2num[label]
            return img_as_img, number_label

    def __len__(self):
        return self.real_len
```
```python
# Create the dataset objects.
train_csv_path = './data/classify-leaves/train.csv'
test_csv_path = './data/classify-leaves/test.csv'
img_path = './data/classify-leaves/'

train_dataset = LeaveDataSet(train_csv_path, img_path, mode='train')
test_dataset = LeaveDataSet(test_csv_path, img_path, mode='test')
valid_dataset = LeaveDataSet(train_csv_path, img_path, mode='valid')
```
```
Finished reading the train set of Leaves Dataset. (14681 samples found)
Finished reading the test set of Leaves Dataset. (18353 samples found)
Finished reading the valid set of Leaves Dataset. (3672 samples found)
```
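The train/valid sizes printed above come straight from the index arithmetic in __init__: the first (1 - valid_ratio) of the rows become the training split and the rest validation. A small helper (hypothetical, for illustration only) makes that explicit:

```python
def split_lengths(data_len, valid_ratio=0.2):
    # Same arithmetic as LeaveDataSet.__init__: floor of the training fraction,
    # with the remainder going to validation.
    train_len = int(data_len * (1 - valid_ratio))
    return train_len, data_len - train_len

print(split_lengths(10))  # (8, 2)
```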
Implementing the DataLoader
Define the train, valid, and test DataLoaders; see Reference 4 for details on the parameters.
```python
train_loader = DataLoader(
    dataset=train_dataset,
    batch_size=8,
    shuffle=False,  # the baseline keeps shuffle=False; shuffle=True is more common for training
    num_workers=5
)
valid_loader = DataLoader(
    dataset=valid_dataset,
    batch_size=8,
    shuffle=False,
    num_workers=5
)
test_loader = DataLoader(
    dataset=test_dataset,
    batch_size=8,
    shuffle=False,
    num_workers=5
)
```
Getting the GPU

```python
def get_device():
    return 'cuda' if torch.cuda.is_available() else 'cpu'

device = get_device()
print(device)
```
Defining the model
The model used here is a ResNet-34.
```python
def res_model(num_classes, feature_extract=False, use_pretrained=False):
    """
    Args:
        num_classes: number of classes; this dataset has 176 leaf species,
                     so we will call this function with num_classes=176
        use_pretrained: whether to load pretrained weights
    """
    # Note: feature_extract is unused in this baseline (it would freeze the backbone).
    model_ft = models.resnet34(pretrained=use_pretrained)
    num_ftrs = model_ft.fc.in_features
    # Replace the final fully connected layer with a new classification head.
    model_ft.fc = nn.Sequential(nn.Linear(num_ftrs, num_classes))
    return model_ft
```
Hyperparameters

```python
lr = 3e-4
weight_decay = 1e-3
num_epoch = 50
model_path = './pre_res_model.ckpt'
```
Training and validation

```python
model = res_model(176)
model = model.to(device)
model.device = device

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)

n_epochs = num_epoch
best_acc = 0.0
for epoch in range(n_epochs):
    # ---- training ----
    model.train()
    train_loss = []
    train_accs = []
    for batch in tqdm(train_loader):
        imgs, labels = batch
        imgs = imgs.to(device)
        labels = labels.to(device)
        logits = model(imgs)
        loss = criterion(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        acc = (logits.argmax(dim=-1) == labels).float().mean()
        train_loss.append(loss.item())
        train_accs.append(acc.item())
    train_loss = sum(train_loss) / len(train_loss)
    train_acc = sum(train_accs) / len(train_accs)
    print(f"[ Train | {epoch + 1:03d}/{n_epochs:03d} ] loss = {train_loss:.5f}, acc = {train_acc:.5f}")

    # ---- validation ----
    model.eval()
    valid_loss = []
    valid_accs = []
    for batch in tqdm(valid_loader):
        imgs, labels = batch
        with torch.no_grad():
            logits = model(imgs.to(device))
        loss = criterion(logits, labels.to(device))
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()
        valid_loss.append(loss.item())
        valid_accs.append(acc.item())
    valid_loss = sum(valid_loss) / len(valid_loss)
    valid_acc = sum(valid_accs) / len(valid_accs)
    print(f"[ Valid | {epoch + 1:03d}/{n_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f}")

    # Save a checkpoint whenever validation accuracy improves.
    if valid_acc > best_acc:
        best_acc = valid_acc
        torch.save(model.state_dict(), model_path)
        print('saving model with acc {:.3f}'.format(best_acc))
```
(console output omitted)
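The per-batch accuracy in the loop, `(logits.argmax(dim=-1) == labels).float().mean()`, is just the fraction of rows whose argmax matches the label. A framework-free equivalent (illustrative only, using nested lists instead of tensors):

```python
def batch_accuracy(logits, labels):
    # Take the argmax over each row of logits, then compare against the labels.
    preds = [row.index(max(row)) for row in logits]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

print(batch_accuracy([[0.1, 0.9], [0.8, 0.2]], [1, 1]))  # 0.5
```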
Prediction
That covers training; now use the trained model to predict.
```python
saveFileName = './data/classify-leaves/submission.csv'

model = res_model(176)
model = model.to(device)
model.load_state_dict(torch.load(model_path))
model.eval()

predictions = []
for batch in tqdm(test_loader):
    imgs = batch
    with torch.no_grad():
        logits = model(imgs.to(device))
    predictions.extend(logits.argmax(dim=-1).cpu().numpy().tolist())

# Decode the predicted indices back to class names.
preds = []
for i in predictions:
    preds.append(num2class[i])

test_data = pd.read_csv(test_csv_path)
test_data['label'] = pd.Series(preds)
submission = pd.concat([test_data['image'], test_data['label']], axis=1)
submission.to_csv(saveFileName, index=False)
print("Done!")
```
Improving the model
// TODO
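One improvement commonly tried from a baseline like this is a learning-rate schedule. PyTorch's torch.optim.lr_scheduler.CosineAnnealingLR follows the cosine-annealing formula, sketched below in plain Python (a hand-rolled illustration of the curve, not the library implementation):

```python
import math

def cosine_lr(step, total_steps, lr_max, lr_min=0.0):
    # eta_t = eta_min + (eta_max - eta_min) * (1 + cos(pi * t / T)) / 2
    return lr_min + (lr_max - lr_min) * (1 + math.cos(math.pi * step / total_steps)) / 2

# Decays smoothly from lr_max at step 0 down to lr_min at the final step.
print(cosine_lr(0, 50, 3e-4))   # 0.0003
print(cosine_lr(25, 50, 3e-4))  # 0.00015
```

In the actual training loop one would instead create `torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=n_epochs)` and call `scheduler.step()` once per epoch.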
Reflections
As a beginner in deep learning, I made many small mistakes in the code. There are also many details I do not yet understand, for example how the optimizer works and the various formulas involved. These are things to figure out later.
References
1. Part 2 wrap-up competition: image classification (Dive into Deep Learning v2)
2. nekokiku/simple-resnet-baseline
3. The Dataset class in PyTorch: building a dataset interface that fits any model
4. Parameters of torch.utils.data.DataLoader in PyTorch, explained