As the dataset grows, our sample statistics get closer to the true values. The main benefit of CV, averaging out the variance of any single random train/val split, therefore shrinks just as its cost grows: a 5-fold split essentially requires you to build your model five times.
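To make that cost concrete, here is a minimal sketch (assuming scikit-learn and a throwaway logistic-regression model, both just placeholders) showing that each of the five folds refits the model from scratch:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=1_000, random_state=0)

scores = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LogisticRegression(max_iter=1_000)  # refit from scratch on every fold
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[val_idx], y[val_idx]))

print(f"5-fold accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```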
You aren't missing anything. Cross-validation is meant for small datasets. For larger ones, you can simply set aside a hold-out set, say 5% of the data, to use when you need to tune hyperparameters.
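A quick sketch of carving out that hold-out set; the 5% figure and the use of scikit-learn's `train_test_split` on a synthetic "large" dataset are just illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Stand-in for a large dataset.
X, y = make_classification(n_samples=100_000, random_state=0)

# Keep 5% aside as a hold-out set for hyperparameter adjustment.
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.05, random_state=0
)
```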
You tune your hyperparameters (such as the number of layers and the learning rate) against the cross-validation set. The test set then lets you verify that those hyperparameters aren't overfit to the CV set.
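Here is a minimal sketch of that workflow: tune on the validation (CV) set, then touch the test set exactly once to check the chosen configuration. The 60/20/20 split, the learning-rate grid, and `SGDClassifier` are assumptions for illustration, not a prescribed setup:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5_000, random_state=0)
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Tune the learning rate using the validation (CV) set only.
best_lr, best_score = None, -1.0
for lr in (1e-4, 1e-3, 1e-2, 1e-1):
    clf = SGDClassifier(learning_rate="constant", eta0=lr, random_state=0)
    clf.fit(X_train, y_train)
    score = clf.score(X_val, y_val)
    if score > best_score:
        best_lr, best_score = lr, score

# One final check on the untouched test set with the chosen hyperparameter.
final = SGDClassifier(learning_rate="constant", eta0=best_lr, random_state=0)
final.fit(X_train, y_train)
print(f"validation acc: {best_score:.3f}, test acc: {final.score(X_test, y_test):.3f}")
```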
It's the same reasoning that motivated the train/test split in the first place: parameters optimized against one particular slice of data may be biased or overfit to it. Even once your hyperparameters perform well on both the train and CV sets, you still need one more untouched set to confirm you haven't overfit and that you'll get comparable results on real-world data.
You don’t need a CV set if, for whatever reason, you already know what architecture you’re using, what your learning rate is, etc.