{"id":374,"date":"2021-10-21T09:11:16","date_gmt":"2021-10-21T08:11:16","guid":{"rendered":"http:\/\/justmakeit.es\/?p=374"},"modified":"2021-10-21T09:11:16","modified_gmt":"2021-10-21T08:11:16","slug":"mas-cosas-con-datasets","status":"publish","type":"post","link":"http:\/\/justmakeit.es\/?p=374","title":{"rendered":"More things with Datasets"},"content":{"rendered":"\n<p>I'm going to create an empty <em>Dataset<\/em> and use it to accumulate the results of several queries that I'll run one after another. To do that we'll use the <em>.union()<\/em> method. Because <em>.union()<\/em> matches columns by position rather than by name, the empty <em>Dataset<\/em> must have the same <em>schema<\/em> (the same columns, in the same order, with compatible types) as the <em>Datasets<\/em> it will be combined with.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import java.util.ArrayList;\nimport java.util.List;\nimport org.apache.spark.sql.Dataset;\nimport org.apache.spark.sql.Row;\nimport org.apache.spark.sql.types.DataTypes;\nimport org.apache.spark.sql.types.StructField;\nimport org.apache.spark.sql.types.StructType;\n\n\/\/ Column names of our Dataset, separated by spaces\nString schemaString = \"ID CAMPO RULEID\";\n\nList&lt;StructField&gt; fields = new ArrayList&lt;&gt;();\n\n\/\/ Split the names on the space and create one nullable StructField per name; here every column is a String\nfor (String fieldName : schemaString.split(\" \")) {\n    StructField field = DataTypes.createStructField(fieldName, DataTypes.StringType, true);\n    fields.add(field);\n}\n\n\/\/ Build the StructType from the StructFields we just created\nStructType schema = DataTypes.createStructType(fields);\n\n\/\/ Create the empty Dataset with the structure we just defined (sparkSession is an existing SparkSession)\nDataset&lt;Row&gt; emptyDataSet = sparkSession.createDataFrame(new ArrayList&lt;&gt;(), schema);<\/pre>\n\n\n\n<p>Once the emptyDataSet has been created, we can start using it\u2026<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import org.apache.spark.sql.functions;\n\nString query = \"SELECT ID,CAMPO FROM TBL WHERE....\";\nString nombreColumna = \"RULEID\";\n\/\/ The Dataset returned by the query only has the ID and CAMPO columns\nDataset&lt;Row&gt; temporal = sparkSession.sql(query);\n\n\/\/ Add the information we need: the RULEID column as a literal value (valor is not defined in this snippet and is assumed to be a String set elsewhere)\ntemporal = temporal.withColumn(nombreColumna, functions.lit(valor));\n\n\/\/ Union the Datasets to accumulate the data; union() returns a new Dataset, so we reassign the result\nemptyDataSet = emptyDataSet.union(temporal);<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>I'm going to create an empty Dataset and use it to accumulate the results of several queries that I'll &hellip; <a href=\"http:\/\/justmakeit.es\/?p=374\" class=\"btn btn-readmore\">Read More <span class=\"screen-reader-text\"> \u00abMore things with Datasets\u00bb<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5,14],"tags":[18,49,36,34],"class_list":["post-374","post","type-post","status-publish","format-standard","hentry","category-java","category-programacion","tag-big-data","tag-dataset","tag-java","tag-spark"],"_links":{"self":[{"href":"http:\/\/justmakeit.es\/index.php?rest_route=\/wp\/v2\/posts\/374","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/justmakeit.es\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/justmakeit.es\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/justmakeit.es\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/justmakeit.es\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=374"}],"version-history":[{"count":1,"href":"http:\/\/justmakeit.es\/index.php?rest_route=\/wp\/v2\/posts\/374\/revisions"}],"predecessor-version":[{"id":375,"href":"http:\/\/justmakeit.es\/index.php?rest_route=\/wp\/v2\/posts\/374\/revisions\/375"}],"wp:attachment":[{"href":"http:\/\/justmakeit.es\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=374"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/justmakeit.es\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=374"},{"taxonomy":"post_tag","embeddable":true,"hre
f":"http:\/\/justmakeit.es\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=374"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}