Towards an automated clustering for online news events: A method proposal and data set for further development