It is known that successful load disaggregation via deep learning relies on a large amount of labeled data to train deep neural networks. However, acquiring appliance-level power data or ON/OFF labels at scale is difficult and expensive. To overcome this weakness, in this paper unsupervised pre-training is applied to the state-of-the-art sequence-to-point (s2p) deep learning approach to NILM, so that no labels are required during pre-training. In the proposed method, the s2p deep neural network is first pre-trained on unlabeled aggregate power readings from other houses, and then fine-tuned on a small set of aggregate power data from the target house, labeled by individual appliance monitoring. Finally, the resulting network is tested by taking the aggregate power of the target house as input and outputting the power signal of the target appliance. The proposed method is validated on the UK REFIT dataset and benchmarked against s2p using two popular evaluation metrics. Experimental results show that the proposed unsupervised pre-training effectively improves the NILM performance of the deep neural network when labeled training data are scarce.
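The pre-train-then-fine-tune workflow described above can be sketched in a few lines of NumPy. This is a minimal illustration only: the sliding-window construction follows the general s2p idea (map an aggregate window to its midpoint value), but the masked-midpoint pretext task, the tiny window length, the linear stand-in for the s2p network, and all function names are assumptions for the sketch, not the paper's actual model or code.

```python
# Illustrative sketch (NOT the paper's implementation): a linear model
# stands in for the s2p network, and the unsupervised pretext task is
# assumed to be predicting a window's masked midpoint from its neighbours.
import numpy as np

WIN = 9  # s2p window length (illustrative; real s2p uses much longer windows)

def make_windows(series, win=WIN):
    """Slide a window over a 1-D power series; the target is its midpoint."""
    X = np.stack([series[i:i + win] for i in range(len(series) - win + 1)])
    y = series[win // 2 : len(series) - win // 2]
    return X, y.copy()

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

def train(X, y, w, lr=0.01, epochs=300):
    """Plain gradient descent on the linear stand-in for the s2p network."""
    w = w.copy()
    for _ in range(epochs):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w

rng = np.random.default_rng(0)
t = np.arange(2000)
# Unlabeled aggregate readings from "other houses" (synthetic stand-in).
agg_other = np.sin(0.1 * t) + 0.05 * rng.standard_normal(t.size)

# 1) Unsupervised pre-training: predict each window's masked midpoint
#    from its neighbours -- no appliance-level labels are needed.
Xp, yp = make_windows(agg_other)
Xp[:, WIN // 2] = 0.0                      # mask the midpoint
w_pre = train(Xp, yp, np.zeros(WIN))

# 2) Supervised fine-tuning on a small labeled set from the target house,
#    starting from the pre-trained weights rather than from scratch.
agg_target = np.sin(0.1 * t[:300]) + 0.05 * rng.standard_normal(300)
appliance = 0.5 * agg_target               # toy sub-metered appliance signal
Xf, _ = make_windows(agg_target)
yf = appliance[WIN // 2 : 300 - WIN // 2]  # appliance power at each midpoint
w_fine = train(Xf, yf, w_pre)

# 3) Testing: the fine-tuned model maps aggregate windows of the target
#    house to the target appliance's power signal.
pred = Xf @ w_fine
```

A real implementation would replace the linear model with the s2p convolutional network and keep the same two-stage weight hand-off: the weights learned on unlabeled aggregate data initialize the supervised fine-tuning stage.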