{"id":355,"date":"2020-03-17T13:23:09","date_gmt":"2020-03-17T12:23:09","guid":{"rendered":"http:\/\/justmakeit.es\/?p=355"},"modified":"2020-03-17T13:23:09","modified_gmt":"2020-03-17T12:23:09","slug":"tests-spark-vs-mongodb","status":"publish","type":"post","link":"http:\/\/justmakeit.es\/?p=355","title":{"rendered":"Tests Spark vs mongoDB"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"269\" src=\"http:\/\/justmakeit.es\/wp-content\/uploads\/2020\/03\/mongo_spark-1024x269.png\" alt=\"\" class=\"wp-image-357\" srcset=\"http:\/\/justmakeit.es\/wp-content\/uploads\/2020\/03\/mongo_spark-1024x269.png 1024w, http:\/\/justmakeit.es\/wp-content\/uploads\/2020\/03\/mongo_spark-300x79.png 300w, http:\/\/justmakeit.es\/wp-content\/uploads\/2020\/03\/mongo_spark-768x202.png 768w, http:\/\/justmakeit.es\/wp-content\/uploads\/2020\/03\/mongo_spark.png 1351w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Spark + mongoDB<\/figcaption><\/figure>\n\n\n\n<p>I recently built a job that reads data from a MongoDB instance and then processes it with Spark, after parsing the XML embedded in each document.<\/p>\n\n\n\n<p>The problem came when I tried to write the coverage tests for it: Jenkins would not let me download some of the libraries that provide an embedded MongoDB, so I had to try several until I found one that needed no additional download.<\/p>\n\n\n\n<p>Here is the URL of the repository by the gentleman who so kindly put in the work on these libraries, and who deserves full credit for how simple he has made them to use:<\/p>\n\n\n\n<p><a href=\"https:\/\/github.com\/bwaldvogel\/mongo-java-server\">https:\/\/github.com\/bwaldvogel\/mongo-java-server<\/a><\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>To set it up, follow these 
steps:<\/p>\n\n\n\n<p>Add the following dependencies to the pom.xml file:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><dependency>\n<groupId>de.bwaldvogel<\/groupId>\n<artifactId>mongo-java-server<\/artifactId>\n<version>1.24.0<\/version>\n<\/dependency>\n\n<dependency>\n<groupId>de.bwaldvogel<\/groupId>\n<artifactId>mongo-java-server-core<\/artifactId>\n<version>1.24.0<\/version>\n<\/dependency>\n\n<dependency>\n<groupId>de.bwaldvogel<\/groupId>\n<artifactId>mongo-java-server-memory-backend<\/artifactId>\n<version>1.24.0<\/version>\n<\/dependency><\/code><\/pre>\n\n\n\n<p>Create the Java test.<\/p>\n\n\n\n<p>Add the imports:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import java.io.IOException;\nimport java.net.InetSocketAddress;\nimport java.net.URL;\nimport java.nio.charset.StandardCharsets;\n\nimport org.apache.spark.sql.SparkSession;\nimport org.bson.Document;\nimport org.junit.jupiter.api.Test; \/\/ assumed: JUnit 5, since the test method is not public\nimport com.google.common.io.Resources; \/\/ assumed: Guava, for Resources.getResource and Resources.toString\nimport com.mongodb.MongoClient;\nimport com.mongodb.ServerAddress;\nimport com.mongodb.client.MongoCollection;\nimport de.bwaldvogel.mongo.MongoServer;\nimport de.bwaldvogel.mongo.backend.memory.MemoryBackend;<\/code><\/pre>\n\n\n\n<p>Then the test itself:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>private MongoCollection<Document> collection;\nprivate MongoClient client;\nprivate MongoServer server;\nprivate SparkSession spark;\n\n    @Test\n    final void testExecuteProcess() throws IOException {\n\n    \/\/ in-memory MongoDB server, nothing to download\n    server = new MongoServer(new MemoryBackend());\n\n    \/\/ optionally: server.enableSsl(key, keyPassword, certificate);\n\n    \/\/ bind on a random local port\n    InetSocketAddress serverAddress = server.bind();\n\n    spark = SparkSession.builder()\n      .master(\"local\") \/\/ only for debug\n      .appName(\"TESTS_mongo-db-motor\")\n      .config(\"spark.mongodb.input.uri\", \"mongodb:\/\/\"+\n      serverAddress.getHostName() + \/\/ host name == localhost\n      \":\"+serverAddress.getPort()+\n      \"\/test\") \/\/ the \/test suffix is only needed when user authentication is enabled\n   
   .config(\"spark.mongodb.input.database\", \"test\") \/\/ database name == test\n      .config(\"spark.mongodb.input.collection\", \"test_collection\") \/\/ collection name == test_collection\n      .getOrCreate();\n\n    client = new MongoClient(new ServerAddress(serverAddress));\n    \/\/ the name must match spark.mongodb.input.collection exactly (no surrounding spaces)\n    collection = client.getDatabase(\"test\").getCollection(\"test_collection\");\n\n    \/\/ test.json is a document exported from the original MongoDB\n    URL url = Resources.getResource(\"test.json\");\n    String text = Resources.toString(url, StandardCharsets.UTF_8);\n\n    \/\/ parse it\n    Document doc = Document.parse(text);\n    \/\/ and insert it into the collection\n    collection.insertOne(doc);\n\n    MyClass.executeProcess(spark, \"21-01-2020\");\n\n    client.close();\n    server.shutdown();\n    }<\/code><\/pre>\n\n\n\n<p>In this example the port number is random and the host name is localhost, because the server runs locally.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I recently built a job that reads data from a MongoDB instance and then processes it with Spark, after parsing &hellip; <a href=\"http:\/\/justmakeit.es\/?p=355\" class=\"btn btn-readmore\">Read More <span class=\"screen-reader-text\"> \u00abTests Spark vs 
mongoDB\u00bb<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14],"tags":[42,41,34,43],"class_list":["post-355","post","type-post","status-publish","format-standard","hentry","category-programacion","tag-junit","tag-mongodb","tag-spark","tag-tests"],"_links":{"self":[{"href":"http:\/\/justmakeit.es\/index.php?rest_route=\/wp\/v2\/posts\/355","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/justmakeit.es\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/justmakeit.es\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/justmakeit.es\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/justmakeit.es\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=355"}],"version-history":[{"count":3,"href":"http:\/\/justmakeit.es\/index.php?rest_route=\/wp\/v2\/posts\/355\/revisions"}],"predecessor-version":[{"id":359,"href":"http:\/\/justmakeit.es\/index.php?rest_route=\/wp\/v2\/posts\/355\/revisions\/359"}],"wp:attachment":[{"href":"http:\/\/justmakeit.es\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=355"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/justmakeit.es\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=355"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/justmakeit.es\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=355"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}