1. MovieLens data

https://grouplens.org/datasets/movielens/
For learning and practice, use the smallest dataset:
[ml-latest-small](https://files.grouplens.org/datasets/movielens/ml-latest-small.zip)
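
If you prefer to script the download, the following is a minimal sketch using only the Python standard library. The URL is the one linked above; the function names `extract_zip` and `download_dataset` and the `dest_dir` argument are illustrative assumptions, not part of MovieLens or Logstash.

```python
import io
import urllib.request
import zipfile
from typing import List

# URL taken from the link above.
DATASET_URL = "https://files.grouplens.org/datasets/movielens/ml-latest-small.zip"


def extract_zip(zip_bytes: bytes, dest_dir: str) -> List[str]:
    """Extract an in-memory zip archive into dest_dir and return its member names."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


def download_dataset(dest_dir: str = ".", url: str = DATASET_URL) -> List[str]:
    """Download the archive (requires network access) and unpack it into dest_dir."""
    with urllib.request.urlopen(url) as response:
        return extract_zip(response.read(), dest_dir)
```

Calling `download_dataset("/export/_backup/elk_bak")` should place movies.csv at the path used in the configuration in step 2, assuming the archive unpacks into an ml-latest-small/ folder.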

2. Logstash configuration file

Copy the logstash-sample.conf file in the Logstash config directory and name the copy logstash-movies.conf. Its contents are as follows:

# Logstash configuration for a simple
# CSV file -> Logstash -> Elasticsearch pipeline.

input {
  file {
    path => "/export/_backup/elk_bak/ml-latest-small/movies.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

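filter {
  # Parse each line into three columns; "content" holds "Title (Year)".
  csv {
    separator => ","
    columns => ["id", "content", "genre"]
  }

  # Turn the pipe-separated genre string into an array and drop unused fields.
  mutate {
    split => { "genre" => "|"}
    remove_field => ["path", "host", "@timestamp", "message"]
  }

  # Split "Title (Year)" on "(": element 0 is the title, element 1 is "Year)".
  mutate {
    split => { "content" => "(" }
    add_field => { "title" => "%{[content][0]}"}
    add_field => { "year" => "%{[content][1]}"}
  }

  # Convert year to an integer, trim whitespace around the title,
  # and remove the intermediate "content" field.
  mutate {
    convert => {
      "year" => "integer"
    }
    strip => ["title"]
    remove_field => ["path", "host", "@timestamp", "content"]
  }
}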

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "movies"
    document_id => "%{id}"
    #user => "user"
    #password => "password"
  }
  stdout {}
}
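
To make the filter section easier to follow, here is a rough Python sketch of what the csv and mutate steps do to a single line of movies.csv. It is not how Logstash works internally; the function name `transform_row` is hypothetical, and the sample line is reconstructed from the console output shown in step 3.

```python
import csv
import re


def transform_row(line: str) -> dict:
    """Approximate the csv + mutate filters for one line of movies.csv."""
    # csv filter: columns => ["id", "content", "genre"]
    movie_id, content, genre = next(csv.reader([line]))

    # split => { "genre" => "|" }
    genres = genre.split("|")

    # split => { "content" => "(" }, then title = content[0], year = content[1]
    parts = content.split("(")
    title = parts[0].strip()  # strip => ["title"]

    # convert => { "year" => "integer" }: keep the leading digits of e.g. "1991)".
    # This sketch returns None when no leading digits are found.
    match = re.match(r"\d+", parts[1]) if len(parts) > 1 else None
    year = int(match.group()) if match else None

    return {"id": movie_id, "genre": genres, "title": title, "year": year}


# Sample line reconstructed from the console output in step 3.
SAMPLE_LINE = "193609,Andrew Dice Clay: Dice Rules (1991),Comedy"
print(transform_row(SAMPLE_LINE))
# {'id': '193609', 'genre': ['Comedy'], 'title': 'Andrew Dice Clay: Dice Rules', 'year': 1991}
```

Because the split happens on the first "(", a title that contains an earlier parenthesized part would put that text, not the year, into the second element. The sketch returns `None` for the year in that case.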

3. Execute the import

bin/logstash -f config/logstash-movies.conf
The import takes a while.
The console output then looks like this:
 ......
{
          "id" => "193609",
       "genre" => [
        [0] "Comedy"
    ],
       "title" => "Andrew Dice Clay: Dice Rules",
    "@version" => "1",
        "year" => 1991
}

When the console stops printing output, press Ctrl+C to stop Logstash.

4. Check in Kibana that the data was imported into the index

If the imported index (movies) appears under Index Management, the import succeeded.
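
As an alternative to Kibana, the document count can also be read from Elasticsearch's `_count` endpoint. The sketch below assumes the host and index name from the configuration in step 2 and a cluster without authentication (the user and password lines are commented out there). The helper names `count_url`, `parse_count`, and `fetch_count` are hypothetical.

```python
import json
import urllib.request

ES_HOST = "http://localhost:9200"  # from the output section of the configuration
INDEX = "movies"                   # index name used in the configuration


def count_url(host: str = ES_HOST, index: str = INDEX) -> str:
    """Build the URL of the _count endpoint for the given index."""
    return f"{host.rstrip('/')}/{index}/_count"


def parse_count(body: bytes) -> int:
    """Extract the "count" field from a _count JSON response."""
    return json.loads(body)["count"]


def fetch_count(host: str = ES_HOST, index: str = INDEX) -> int:
    """Query a running Elasticsearch node and return the number of documents."""
    with urllib.request.urlopen(count_url(host, index)) as response:
        return parse_count(response.read())
```

With Elasticsearch running, `fetch_count()` returning a value greater than zero indicates that documents reached the movies index.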


丰木