import {
  Array1D,
  Graph,
  Session,
  NDArrayMathGPU,
} from 'deeplearn';

const math = new NDArrayMathGPU();

class ColorAccessibilityModel {

  session;

  inputTensor;
  targetTensor;
  predictionTensor;
  costTensor;

  ...

  prepareTrainingSet(trainingSet) {
    math.scope(() => {
      const { rawInputs, rawTargets } = trainingSet;

      const inputArray = rawInputs.map(v => Array1D.new(this.normalizeColor(v)));
      const targetArray = rawTargets.map(v => Array1D.new(v));
    });
  }

  ...
}

export default ColorAccessibilityModel;
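The normalizeColor() method called above is not shown in this snippet. A minimal sketch, assuming the raw inputs are RGB triplets in the 0 to 255 range: scaling each channel down to a value between 0 and 1 keeps the inputs small, which generally helps training converge.

class ColorAccessibilityModel {

  ...

  // Hypothetical helper: scales each RGB channel from [0, 255] to [0, 1].
  normalizeColor(color) {
    return color.map(channel => channel / 255);
  }
}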
Third, shuffle the input and target arrays. While shuffling, the shuffler provided by deeplearn.js keeps both arrays in sync. The shuffle happens on every training iteration, so that different inputs are fed to the neural network as batches. The whole shuffling process improves the trained algorithm, because the network is more likely to generalize by avoiding overfitting.
import {
  Array1D,
  InCPUMemoryShuffledInputProviderBuilder,
  Graph,
  Session,
  NDArrayMathGPU,
} from 'deeplearn';

const math = new NDArrayMathGPU();

class ColorAccessibilityModel {

  session;

  inputTensor;
  targetTensor;
  predictionTensor;
  costTensor;

  ...

  prepareTrainingSet(trainingSet) {
    math.scope(() => {
      const { rawInputs, rawTargets } = trainingSet;

      const inputArray = rawInputs.map(v => Array1D.new(this.normalizeColor(v)));
      const targetArray = rawTargets.map(v => Array1D.new(v));

      const shuffledInputProviderBuilder = new InCPUMemoryShuffledInputProviderBuilder([
        inputArray,
        targetArray,
      ]);

      const [
        inputProvider,
        targetProvider,
      ] = shuffledInputProviderBuilder.getInputProviders();
    });
  }

  ...
}

export default ColorAccessibilityModel;
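getInputProviders() returns one provider per array, in the order the arrays were passed to the builder. As an illustration only (these lines are not part of the model), both providers advance through the same shuffled order, so each input stays paired with its target; getNextCopy() is assumed here from deeplearn.js's InputProvider interface:

// Illustration only: successive calls walk through one shared shuffled order.
const input = inputProvider.getNextCopy(math);   // some shuffled entry of inputArray
const target = targetProvider.getNextCopy(math); // the matching entry of targetArray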
Finally, the feed entries are the ultimate input for the feedforward algorithm of the neural network in the training phase. They pair up data and tensors (which were defined by their shapes in the setup phase).
import {
  Array1D,
  InCPUMemoryShuffledInputProviderBuilder,
  Graph,
  Session,
  NDArrayMathGPU,
} from 'deeplearn';

const math = new NDArrayMathGPU();

class ColorAccessibilityModel {

  session;

  inputTensor;
  targetTensor;
  predictionTensor;
  costTensor;
  feedEntries;

  ...

  prepareTrainingSet(trainingSet) {
    math.scope(() => {
      const { rawInputs, rawTargets } = trainingSet;

      const inputArray = rawInputs.map(v => Array1D.new(this.normalizeColor(v)));
      const targetArray = rawTargets.map(v => Array1D.new(v));

      const shuffledInputProviderBuilder = new InCPUMemoryShuffledInputProviderBuilder([
        inputArray,
        targetArray,
      ]);

      const [
        inputProvider,
        targetProvider,
      ] = shuffledInputProviderBuilder.getInputProviders();

      this.feedEntries = [
        { tensor: this.inputTensor, data: inputProvider },
        { tensor: this.targetTensor, data: targetProvider },
      ];
    });
  }

  ...
}

export default ColorAccessibilityModel;
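To see where these feed entries end up, here is a glimpse ahead at the training phase. In deeplearn.js, session.train() consumes the cost tensor, the feed entries, a batch size, and an optimizer (the last two are introduced below). A sketch of that call, where CostReduction.MEAN averages the cost over each batch:

import { CostReduction } from 'deeplearn';

...

  train() {
    math.scope(() => {
      this.session.train(
        this.costTensor,
        this.feedEntries,
        this.batchSize,
        this.optimizer,
        CostReduction.MEAN,
      );
    });
  }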
With that, the setup of the neural network is finished. All of its layers and units are implemented, and the training set is prepared for training. Now only two hyperparameters need to be added to configure the network's behavior; they apply to the next phase: the training phase.
import {
  Array1D,
  InCPUMemoryShuffledInputProviderBuilder,
  Graph,
  Session,
  SGDOptimizer,
  NDArrayMathGPU,
} from 'deeplearn';

const math = new NDArrayMathGPU();

class ColorAccessibilityModel {

  session;

  optimizer;

  batchSize = 300;
  initialLearningRate = 0.06;

  inputTensor;
  targetTensor;
  predictionTensor;
  costTensor;
  feedEntries;

  constructor() {
    this.optimizer = new SGDOptimizer(this.initialLearningRate);
  }

  ...
}

export default ColorAccessibilityModel;
The first parameter is the learning rate. It determines how fast the algorithm converges when minimizing the cost. One might assume it should be high, but it must not be too high; otherwise gradient descent never converges, because it keeps overshooting and cannot settle into a local optimum.
The second parameter is the batch size. It defines how many data points of the training set pass through the neural network in each epoch (iteration). An epoch equals one forward pass and one backward pass of one batch of data points. Training a neural network in batches has two benefits. First, it is less computationally intensive, because the algorithm trains on only a small subset of the data points in memory. Second, the network trains faster, because the weights are adjusted after each batch within an epoch, rather than only after the whole training set has passed through.
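Conceptually, each SGD step moves every weight against its cost gradient, scaled by the learning rate; with a batch size of 300, this update runs once per 300 data points rather than once per full pass over the data. A minimal sketch of that update rule (plain JavaScript for illustration, not the deeplearn.js API):

// One conceptual SGD step: w <- w - learningRate * dCost/dw.
// A large learningRate takes bigger steps (faster, but it may overshoot);
// a small one takes smaller steps (slower, but more likely to converge).
function sgdStep(weights, gradients, learningRate) {
  return weights.map((w, i) => w - learningRate * gradients[i]);
}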
Training Phase